Abstract

We discuss a new notion of distance on the space of finite and nonnegative measures
on $\Omega \subset \mathbb{R}^d$, which we call the Hellinger-Kantorovich distance. It can be seen as an
inf-convolution of the well-known Kantorovich-Wasserstein distance and the
Hellinger-Kakutani distance. The new distance is based on a dynamical formulation given
by an Onsager operator that is the sum of a Wasserstein diffusion part and an
additional reaction part describing the generation and absorption of mass.

We present a full characterization of the distance and some of its properties. 

In particular, the distance can be equivalently described by an optimal transport
problem on the cone space over the underlying space $\Omega$. We give a construction of
geodesic curves and discuss examples and their general properties.

1 Introduction 

Starting from the pioneering works [JKO97] and [JKO98], the reinterpretation of certain
scalar diffusion equations as so-called Wasserstein gradient flows led to new analytic tools
and concepts and gave deeper insight into diffusion problems, see e.g. [Vil09] and [AD*11].
In particular, in connection with suitable convexity properties of the driving functional
the abstract theory of gradient flows in metric spaces developed in [AGS05] provides a
sound and comprehensive geometric framework for these evolution equations.

The recent reformulation of classes of reaction-diffusion systems as gradient systems,
see [Mie11, Mie13, LiM13], raises the question whether the abstract metric theory can
also be developed for this wider class of problems.

Following [Mie11] we understand a gradient system as a triple $(X, \mathcal{F}, \mathbb{K})$ consisting of
a state space $X$, a driving functional $\mathcal{F}$, and an Onsager operator $\mathbb{K}$. The latter means
that $\mathbb{K}$ is a state-dependent, symmetric, and positive semidefinite linear operator. In
many cases the Onsager operator $\mathbb{K}$ induces a dissipation distance $D_{\mathbb{K}}$ on the state space
$X$ by minimizing an action functional over all curves connecting two states. Now, the
development of a metric theory rests upon the ability to characterize this distance and its
properties.

*Weierstraß-Institut für Angewandte Analysis und Stochastik, Berlin.
†Humboldt-Universität zu Berlin.
‡Università di Pavia.
This paper together with the companion paper [LMS15] provides a rigorous characterization
of such a dissipation distance. It is based on the simple Onsager operator

\[ \mathbb{K}_{\alpha,\beta}(u)\xi = -\alpha\operatorname{div}(u\nabla\xi) + \beta u\xi, \]

where $\alpha, \beta \ge 0$ are fixed parameters. Obviously, the Onsager operator
$\mathbb{K}_{\alpha,\beta} = \alpha\mathbb{K}_{\mathrm{Wass}} + \beta\mathbb{K}_{\mathrm{c\text{-}a}}$ is a sum of a Wasserstein part for diffusion and a
creation-annihilation part, which is the simplest case of a reaction term. For the latter
part it is not difficult to develop a corresponding analog to the Wasserstein distance $W$.
For this, we simply note that $\mathbb{K}_{\mathrm{c\text{-}a}}$ is the inverse of a metric tensor, such that the
formal associated Riemannian structure is given by $v \mapsto \int_\Omega v^2/(\beta u)\,dx$. Thus, it is easy
to see that the Riemannian distance induced by $\mathbb{K}_{0,\beta}$ is a multiple of the
Hellinger-Kakutani distance $H$, see [Hel09, Kak48] and [Sch96], where also the correct
function spaces are discussed. In particular, for measures of the form $\mu_j = f_j\,dx$ we obtain

\[ D_{0,\beta}(\mu_0,\mu_1) = \frac{2}{\sqrt{\beta}}\,H(\mu_0,\mu_1) \quad\text{with}\quad H(\mu_0,\mu_1)^2 = \int_\Omega \big(\sqrt{f_0} - \sqrt{f_1}\big)^2\,dx. \]

Of course, this distance generalizes to the space of finite, nonnegative Borel measures,
denoted by $\mathcal{M}(\Omega)$, see Section 2.3. In the following we shall always assume that the
domain $\Omega \subset \mathbb{R}^d$ is convex and compact.

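We will show that $\mathbb{K}_{\alpha,\beta}$ generates a proper distance on $\mathcal{M}(\Omega)$, which is formally
given in a generalized Benamou-Brenier formulation (see [BeB00]), i.e., by minimizing
over all sufficiently smooth curves $s \mapsto \mu(s)$ connecting measures $\mu_0$ and $\mu_1$, viz.

\[ D_{\alpha,\beta}(\mu_0,\mu_1)^2 = \inf\Big\{ \int_0^1\!\!\int_\Omega \big[\alpha|\nabla\xi|^2 + \beta\xi^2\big]\,d\mu(s)\,ds \;\Big|\; \tfrac{d}{ds}\mu + \alpha\operatorname{div}(\mu\nabla\xi) = \beta\mu\xi,\ \mu_0 \overset{\mu}{\rightsquigarrow} \mu_1 \Big\}. \]

This characterization also works for reaction-diffusion systems, see [LiM13, Sect. 2(e)].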

We call $D_{\alpha,\beta}$ the Hellinger-Kantorovich distance, since it can be understood as an
inf-convolution (weighted by $\alpha$ and $\beta$) of the Kantorovich-Wasserstein distance $W$ and
the Hellinger distance $H$. In particular, geodesic curves for the distance will optimize
the usage of transport against the usage of creation or annihilation. As an outcome of
our theory we will find that transport never occurs over distances longer than $\pi\sqrt{\alpha/\beta}$.

For the rest of this introduction we will use the special choice $\alpha = 1$ and $\beta = 4$, which
simplifies the notation considerably.

To give a full characterization of $D_{1,4}$, we take a detour which will highlight the
underlying geometry of the distance much better. Motivated by an explicit formula for the
distance between two Dirac measures, we introduce the cone space $\mathfrak{C}_\Omega$ over $\Omega$ and define
the Hellinger-Kantorovich distance $\mathsf{HK}(\mu_0,\mu_1)$ by lifting the measures $\mu_j$ to measures $\lambda_j$
on the cone and then minimizing the Wasserstein distance induced by a suitable cone
distance on $\mathfrak{C}_\Omega$. It is then easy to show that $\mathsf{HK}$ is indeed a geodesic distance, since
$W_{\mathfrak{C}_\Omega}$ is a geodesic distance. It is the purpose of Section 4 to show that $D_{1,4}$ indeed
equals $\mathsf{HK}$.

For this proof, we will rely on a third characterization of the Hellinger-Kantorovich
distance, which is given in terms of the entropy-transport functional for calibration
measures $\eta \in \mathcal{M}(\Omega\times\Omega)$ given via

\[ \mathcal{E\!T}_{1,4}(\eta;\mu_0,\mu_1) := \int_\Omega F_B\Big(\frac{d\eta_0}{d\mu_0}\Big)\,d\mu_0 + \int_\Omega F_B\Big(\frac{d\eta_1}{d\mu_1}\Big)\,d\mu_1 + \int_{\Omega\times\Omega} c_{1,4}(|x_0-x_1|)\,d\eta, \]

where $F_B(z) = z\log z - z + 1$, $\eta_i = \Pi^i_\#\eta$ denote the usual marginals, and the cost function
$c_{1,4}$ is given by

\[ c_{1,4}(L) := \begin{cases} -2\log(\cos L) & \text{for } L < \pi/2,\\ \infty & \text{for } L \ge \pi/2. \end{cases} \]

Since $\mathcal{E\!T}_{1,4}(\cdot\,;\mu_0,\mu_1)$ is convex, it is easy to find minimizers, see [LMS15] for more details
and the proof that $\mathsf{HK}(\mu_0,\mu_1)^2 = \min\{\, \mathcal{E\!T}_{1,4}(\eta;\mu_0,\mu_1) \mid \eta \in \mathcal{M}(\Omega\times\Omega) \,\}$.

To be more specific, we return to the question of computing the distance between two
Dirac masses $\mu_j = a_j\delta_{y_j}$ with $y_j \in \Omega$ and $a_j > 0$. Looking at connecting curves of the
form $\mu(s) = a(s)\delta_{x(s)}$ we can indeed minimize the length of these one-mass-point curves
(1mp) and find the result

\[ D^{\mathrm{1mp}}_{1,4}(a_0\delta_{y_0}, a_1\delta_{y_1})^2 = \begin{cases} a_0 + a_1 - 2\sqrt{a_0a_1}\,\cos(|y_1-y_0|) & \text{for } |y_1-y_0| < \pi,\\ a_0 + a_1 + 2\sqrt{a_0a_1} & \text{for } |y_1-y_0| \ge \pi. \end{cases} \tag{1.1} \]

In fact, a minimizer exists only for $|y_1-y_0| < \pi$, whereas for $|y_1-y_0| \ge \pi$ the value
$D^{\mathrm{1mp}}_{1,4}(a_0\delta_{y_0}, a_1\delta_{y_1})$ is an infimum only. However, it will turn out that these curves are
only optimal for $|y_1-y_0| < \pi/2$, while for $|y_1-y_0| \ge \pi/2$ the two-mass-point curve
$\mu(s) = (1-s)^2a_0\delta_{y_0} + s^2a_1\delta_{y_1}$ is shorter, since its squared length is $a_0 + a_1$. Thus, creation
and annihilation is better than transport in this case.
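The competition between the two strategies can be illustrated numerically. The following Python sketch evaluates the squared length (1.1) of the best one-mass-point curve and the squared length $a_0 + a_1$ of the two-mass-point curve for $\alpha = 1$, $\beta = 4$. The function names are illustrative choices and do not appear in the paper.

```python
import math

def one_mass_point_sq(a0, a1, dist):
    """Squared length from (1.1): a single Dirac mass moving over distance `dist`."""
    c = math.cos(dist) if dist < math.pi else -1.0
    return a0 + a1 - 2.0 * math.sqrt(a0 * a1) * c

def two_mass_point_sq(a0, a1):
    """Squared length of mu(s) = (1-s)^2 a0 delta_y0 + s^2 a1 delta_y1 (no transport)."""
    return a0 + a1

def transport_is_cheaper(a0, a1, dist):
    """True if moving the mass beats annihilating it at y0 and creating it at y1."""
    return one_mass_point_sq(a0, a1, dist) < two_mass_point_sq(a0, a1)

for dist in (0.5, 1.0, 2.0, 3.5):
    print(dist, one_mass_point_sq(1.0, 2.0, dist), two_mass_point_sq(1.0, 2.0))
```

Since the two values differ only by the term $2\sqrt{a_0a_1}\cos(|y_1-y_0|)$, the switch happens exactly where the cosine changes sign, i.e. at distance $\pi/2$, independently of the masses.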


Moreover, the formula in (1.1) suggests to introduce a cone distance on the cone $\mathfrak{C}_\Omega$
over $\Omega$ given by the elements $[x, r]$ for $r > 0$ and the tip $0$, which is an identification of
$\{\, [x, 0] \mid x \in \Omega \,\}$. The cone distance is defined as

\[ d_{\mathfrak{C}}([x_0,r_0],[x_1,r_1])^2 := r_0^2 + r_1^2 - 2r_0r_1\cos_\pi(|x_1-x_0|) \quad\text{with}\quad \cos_b(a) = \cos(\min\{|a|, b\}), \]

see [BBI01, Sect. 3.6.2]. This distance is again a geodesic distance and we can define the
associated Wasserstein distance $W_{\mathfrak{C}_\Omega}$, see Section 3.2.


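Based on this observation we can now lift measures $\mu$ on $\Omega$ to measures $\lambda$ on $\mathfrak{C}_\Omega$ such
that $\mu = \mathfrak{P}\lambda$, where the projection $\mathfrak{P} : \mathcal{M}_2(\mathfrak{C}_\Omega) \to \mathcal{M}(\Omega)$ is defined via

\[ \int_\Omega \phi(x)\,d(\mathfrak{P}\lambda)(x) = \int_{\mathfrak{C}_\Omega} r^2\phi(x)\,d\lambda([x,r]) \quad\text{for all } \phi \in C^0(\Omega). \]

Now, the first definition of the Hellinger-Kantorovich distance is

\[ \mathsf{HK}(\mu_0,\mu_1) = \min\big\{\, W_{\mathfrak{C}_\Omega}(\lambda_0,\lambda_1) \;\big|\; \mathfrak{P}\lambda_0 = \mu_0,\ \mathfrak{P}\lambda_1 = \mu_1 \,\big\}. \tag{1.2} \]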


To further analyze this construction, one needs to study the optimality conditions for the
lifts, which can be done by exploiting the characterization via $\mathcal{E\!T}_{1,4}$, see Theorem 3.6 and
Section 3.3, where the crucial duality theory is taken from [LMS15].


In Section 4 we finally show the identity $D_{1,4} = \mathsf{HK}$ by a full characterization of all
absolutely continuous curves with respect to the distance $\mathsf{HK}$, see Theorem 4.5. This is
done by lifting curves in $\mathcal{M}(\Omega)$ to curves in $\mathcal{M}_2(\mathfrak{C}_\Omega)$ and using a characterization of
absolutely continuous curves with respect to $W_{\mathfrak{C}_\Omega}$, which can be found in [Lis07]. In Corollary
4.4 we obtain the important result that all geodesic curves in $(\mathcal{M}(\Omega), \mathsf{HK})$ are obtained
as projections $\mu(s) = \mathfrak{P}\lambda(s)$, where $\lambda : [0,1] \to \mathcal{M}_2(\mathfrak{C}_\Omega)$ is a geodesic curve in
$(\mathcal{M}_2(\mathfrak{C}_\Omega), W_{\mathfrak{C}_\Omega})$ connecting optimal lifts $\lambda_0$ and $\lambda_1$ in (1.2). Throughout this work,
the notion "geodesic curve", or shortly "geodesic", means constant-speed minimal geodesic, viz.

\[ \mathsf{HK}(\mu(s),\mu(t)) = |s-t|\,\mathsf{HK}(\mu(0),\mu(1)) \quad\text{for all } s, t \in [0,1]. \]

Section 5 is devoted to various examples for geodesic curves, which are obtained by
doing optimal lifts to the cone space and then constructing geodesic curves for the
Wasserstein distance and projecting them down. Since the geodesic curves on the cone
are explicit, this provides an explicit formula for geodesic curves $\mu : [0,1] \to \mathcal{M}(\Omega)$, as
soon as the lifts are specified.

In particular, using this explicit construction we show that the total mass $m(s) =
\mu(s)(\Omega)$ along geodesic curves is 2-convex and 2-concave since we have the identity

\[ m(s) = (1-s)\,m(0) + s\,m(1) - s(1-s)\,\mathsf{HK}(\mu_0,\mu_1)^2. \tag{1.3} \]


We discuss geodesic $\Lambda$-convexity of some functionals. In particular, we show that the
linear functional $\Phi(\mu) = \int_\Omega \phi(x)\,d\mu(x)$ is geodesically $\Lambda$-convex if and only if the function
$[x,r] \mapsto r^2\phi(x)$ is geodesically $\Lambda$-convex in $(\mathfrak{C}_\Omega, d_{\mathfrak{C}})$.

It is also worth noting that the unique geodesic connecting $\mu_1$ to the null measure
$\mu_0 = 0$, which has the lifts $a\delta_0$ for $a \ge 0$, is given by the unique Hellinger geodesic

\[ \mu^H(s) = s^2\mu_1. \]


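This simple observation immediately shows that the logarithmic entropy given by $\mathcal{E}(\mu) =
\int_\Omega F_B(u(x))\,dx$ for $\mu = u\,dx$ is not geodesically $\Lambda$-convex, since inserting $s^2u$ into
$F_B(z) = z\log z - z + 1$ gives

\[ \mathcal{E}(s^2\mu) = s^2\,\mathcal{E}(\mu) + s^2\log(s^2)\,\mu(\Omega) + (1-s^2)\,|\Omega|, \]

where $|\Omega|$ denotes the Lebesgue measure of $\Omega$.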


In Section 5 we reconsider our standard example of the geodesic connections of Dirac
masses $\mu_j = a_j\delta_{y_j}$. It turns out that in the critical case $|y_0-y_1| = \pi/2$ there is an infinite
dimensional convex set of geodesic curves that can be constructed by showing that there
are many optimal lifts to the cone space.

Section 5 also provides a generalization of the classical dilation of measures in the
Wasserstein case. For the Hellinger-Kantorovich distance there is a similar dilation where the
mass inside the ball $\{x \mid |x-y_0| < \pi/2\}$ is radially transported and partly annihilated
into the point $y_0$, while the mass at larger distance is simply annihilated according to the
Hellinger distance.

In Section 5.4 we show how the transport of two characteristic functions occurs in the
Hellinger-Kantorovich case. While the too distant parts are simply annihilated or created
according to the Hellinger metric, the parts that are close enough lead to a continuous
transition, see the accompanying figure.

In Section 5.5 we show that the Hellinger-Kantorovich geodesic between two measures
$\mu_0$ and $\mu_1$ is unique if one of the two measures is absolutely continuous with respect to
the Lebesgue measure. Finally, Section 5.6 shows that $\mathsf{HK}$ is not semiconcave in $\mathcal{M}(\Omega)$
if $\Omega \subset \mathbb{R}^d$ has dimension two or higher, which is in sharp contrast to the Wasserstein
distance, see [AGS05, Def. 12.3.1].

This work, together with its companion paper [LMS15], will form the basis of subsequent
work where we will explore the metric properties of the space $(\mathcal{M}(\Omega), \mathsf{HK})$ and
study gradient systems on this space. In particular, in the spirit of [LiM13], we aim to
establish a metric theory for scalar reaction-diffusion equations of the form

\[ \dot u = -\mathbb{K}_{\alpha,\beta}(u)\,\delta\mathcal{F}(u) = \operatorname{div}\big(\alpha u\nabla(\delta\mathcal{F}(u))\big) - \beta u\,\delta\mathcal{F}(u), \]

where $\delta\mathcal{F}$ denotes a variational derivative.

Note during final preparation. The earliest parts of the work presented here were
first presented at the ERC Workshop on Optimal Transportation and Applications in Pisa
in 2012. Since then the authors developed the theory continuously further and presented
results at different workshops and seminars. We refer to [LMS15, Sect. A] for some remarks
concerning the chronological development. In June 2015 we became aware of the parallel
work [KMV15]. Moreover, in mid August 2015 we became aware of [CP*15a, CP*15b]. So
far, these independent works are not reflected in the present version of this manuscript.


2 Gradient structures for reaction-diffusion equations 


2.1 General philosophy for gradient systems 

We call a triple $(X, \mathcal{F}, \Psi)$ a gradient system in the differentiable sense, if $X$ is a Banach
space containing the states $u$, if the functional $\mathcal{F} : X \to \mathbb{R}_\infty := \mathbb{R} \cup \{\infty\}$ has a Fréchet
subdifferential $D\mathcal{F}(u) \in X^*$ on a suitable subset of $X$, and if $\Psi$ is a dissipation
potential. The latter means that $\Psi(u,\cdot) : X \to [0,\infty]$ is a lower semicontinuous and convex
functional with $\Psi(u,0) = 0$. Denoting by $\Psi^*(u,\cdot) : X^* \to [0,\infty]$ the Legendre-Fenchel
transform $\Psi^*(u,\xi) = \sup\{\, \langle\xi,v\rangle - \Psi(u,v) \mid v \in X \,\}$, the gradient evolution is given via

\[ \dot u = D_\xi\Psi^*\big(u, -D\mathcal{F}(u)\big) \quad\text{or equivalently}\quad 0 = D_{\dot u}\Psi(u,\dot u) + D\mathcal{F}(u). \tag{2.1} \]


For simplicity, we assume that the Fréchet subdifferential $D\mathcal{F}$ and the convex
subdifferentials $D_{\dot u}\Psi$ and $D_\xi\Psi^*$ are single-valued, but the set-valued case can be treated
similarly by the standard generalizations.

If the map $v \mapsto \Psi(u,v)$ is quadratic, we call the above system a classical gradient
system, while otherwise we speak of generalized gradient systems. In the classical case we
can write

\[ \Psi(u,v) = \tfrac12\langle\mathbb{G}(u)v, v\rangle \quad\text{and}\quad \Psi^*(u,\xi) = \tfrac12\langle\xi, \mathbb{K}(u)\xi\rangle, \]

where $\mathbb{G}(u) : X \to X^*$ and $\mathbb{K}(u) : X^* \to X$ are symmetric and positive (semi)definite
operators. Since $\Psi$ and $\Psi^*$ form a dual pair we have $\mathbb{G}(u)^{-1} = \mathbb{K}(u)$ and $\mathbb{K}(u)^{-1} = \mathbb{G}(u)$,
if we interpret these identities in the sense of quadratic forms. We call $\mathbb{G}$ the Riemannian
operator, as it generalizes the Riemannian tensor on finite-dimensional manifolds, while we
call $\mathbb{K}$ the Onsager operator because of Onsager's fundamental contributions in justifying
gradient systems via his reciprocal relations $\mathbb{K}(u) = \mathbb{K}(u)^*$, cf. [Ons31, Eqn. (1.11)] or
[OnM53, Eqs. (2-1)-(2-4)]. Thus, for classical gradient systems the general form (2.1)
specializes to

\[ \dot u = -\mathbb{K}(u)\,D\mathcal{F}(u) \quad\text{or equivalently}\quad \mathbb{G}(u)\dot u = -D\mathcal{F}(u). \tag{2.2} \]

We emphasize that $\mathbb{K}(u)$ maps (a subspace of) $X^*$ to $X$, so generalized thermodynamic
driving forces are mapped to rates. Similarly $\mathbb{G}(u)$ maps rates to viscous dissipative forces,
which have to balance the potential restoring force $-D\mathcal{F}(u)$.

Our work follows the same philosophy as in [JKO98, Ott01]: Even though the above
gradient structure is only formal, it may generate a new dissipation distance, which can
be made rigorous such that finally the gradient structure can be considered as a
mathematically sound metric gradient flow as discussed in [AGS05]. For this one introduces the
dissipation distance associated with the dissipation potential $\Psi$, which is defined via

\[ D_{\mathbb{K}}(u_0,u_1)^2 = \inf\Big\{ \int_0^1 \langle\mathbb{G}(u)\dot u, \dot u\rangle\,ds \;\Big|\; u \in H^1([0,1];X),\ u(0) = u_0,\ u(1) = u_1 \Big\}. \]


2.2 Dissipation distances for reaction-diffusion systems 

It was shown in [Mie11] that certain reaction-diffusion systems admit a formal gradient
structure, which is given by an Onsager operator $\mathbb{K}$ and a driving functional $\mathcal{F}$ of the form

\[ \mathbb{K}(c)\xi = -\operatorname{div}\big(\mathbb{M}(c)\nabla\xi\big) + \mathbb{H}(c)\xi, \qquad \mathcal{F}(c) = \int_\Omega \sum_{i=1}^{I} F_i(c_i)\,dx, \]

where $c = (c_i)_{i=1,\dots,I}$ is the vector of non-negative concentrations of the species $X_i$, $i =
1,\dots,I$, and $\mathbb{M}(c)$ and $\mathbb{H}(c)$ are a symmetric and positive definite mobility tensor and
a reaction matrix, respectively. With the diffusion tensor $\mathbb{D}(c) = \mathbb{M}(c)D^2\mathcal{F}(c)$ and the
reaction term $R(c) = \mathbb{H}(c)D\mathcal{F}(c)$ the generated gradient-flow equation reads

\[ \dot c = -\mathbb{K}(c)\,D\mathcal{F}(c) = \operatorname{div}\big(\mathbb{D}(c)\nabla c\big) - R(c). \]

As in the theory of the Kantorovich-Wasserstein distance (cf. [Ott98, JKO98, Ott01,
Vil09]) the operator $\mathbb{K}(c)$ can be seen as the inverse of a metric tensor $\mathbb{G}(c)$ that gives rise
to a geodesic distance between two densities $c_0, c_1 \in L^1(\Omega; [0,\infty[^I)$ defined abstractly via

\[ D_{\mathbb{K}}(c_0,c_1)^2 := \inf\Big\{ \int_0^1 \big\langle\mathbb{K}(c(t))^{-1}\dot c(t), \dot c(t)\big\rangle\,dt \;\Big|\; c_0 \overset{c}{\rightsquigarrow} c_1 \Big\}. \tag{2.3} \]

Here "$c_0 \overset{c}{\rightsquigarrow} c_1$" means that $t \mapsto c(t)$ is a sufficiently smooth curve with $c(0) = c_0$ and
$c(1) = c_1$.

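Since in general the inversion of $\mathbb{K}$ is difficult or even not well-defined, it is better to
use the following formulation in terms of the dual variable $\xi(s) = \mathbb{K}(c(s))^{-1}\dot c(s)$, namely

\[ D_{\mathbb{K}}(c_0,c_1)^2 := \inf\Big\{ \int_0^1 \langle\mathbb{K}(c)\xi, \xi\rangle\,dt \;\Big|\; \dot c = \mathbb{K}(c)\xi,\ c_0 \overset{c}{\rightsquigarrow} c_1 \Big\}. \tag{2.4} \]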


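In our case of reaction-diffusion operators we can make this even more explicit, namely

\[ D_{\mathbb{K}}(c_0,c_1)^2 := \inf\Big\{ \int_0^1\!\!\int_\Omega \nabla\xi : \mathbb{M}(c)\nabla\xi + \xi\cdot\mathbb{H}(c)\xi\,dx\,dt \;\Big|\; \dot c = -\operatorname{div}\big(\mathbb{M}(c)\nabla\xi\big) + \mathbb{H}(c)\xi,\ c_0 \overset{c}{\rightsquigarrow} c_1 \Big\}. \tag{2.5} \]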


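Finally, we can use the Benamou-Brenier argument [BeB00] to find the following
characterization (cf. [LiM13, Sect. 2.5]):

Proposition 2.1 We have the equivalence

\[ \begin{aligned} D_{\mathbb{K}}(c_0,c_1)^2 &= \inf\Big\{ \int_0^1\!\!\int_\Omega \Xi : \mathbb{M}(c)\Xi + \xi\cdot\mathbb{H}(c)\xi\,dx\,dt \;\Big|\; \dot c = -\operatorname{div}\big(\mathbb{M}(c)\Xi\big) + \mathbb{H}(c)\xi,\ c_0 \overset{c}{\rightsquigarrow} c_1 \Big\} \\ &= \inf\Big\{ \int_0^1\!\!\int_\Omega P : \mathbb{M}(c)^{-1}P + s\cdot\mathbb{H}(c)^{-1}s\,dx\,dt \;\Big|\; \dot c = -\operatorname{div}(P) + s,\ c_0 \overset{c}{\rightsquigarrow} c_1 \Big\}, \end{aligned} \tag{2.6} \]

where $\xi(t,x) = \mathbb{H}(c(t,x))^{-1}s(t,x) \in \mathbb{R}^I$ and $\Xi(t,x) = \mathbb{M}(c(t,x))^{-1}P(t,x) \in \mathbb{R}^{I\times d}$.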


Proof: Clearly, the right-hand side in (2.6) gives a value that is smaller than or equal to
that in (2.5), because we have dropped the constraint $\Xi = \nabla\xi$.

To show that the two definitions give the same value, we have to show that for
minimizers (if they exist) the constraint $\Xi = \nabla\xi$ is automatically satisfied. For this we use
that $\xi$ and $\Xi$ are related by the continuity equation $\dot c = -\operatorname{div}(\mathbb{M}(c)\Xi) + \mathbb{H}(c)\xi$.

Keeping $c$ fixed (and sufficiently smooth) we can minimize the integral in (2.6) with
respect to $\xi$ and $\Xi$, which is a quadratic functional with an affine constraint. Hence, we
can apply the Lagrange multiplier rule to

\[ \mathcal{L}(\Xi,\xi,\lambda) = \int_0^1\!\!\int_\Omega \Xi : \mathbb{M}(c)\Xi + \xi\cdot\mathbb{H}(c)\xi + \lambda\cdot\big(\dot c + \operatorname{div}(\mathbb{M}(c)\Xi) - \mathbb{H}(c)\xi\big)\,dx\,dt \]

to obtain the Euler-Lagrange equations

\[ 0 = 2\mathbb{M}\Xi - \mathbb{M}\nabla\lambda, \qquad 0 = 2\mathbb{H}\xi - \mathbb{H}\lambda, \qquad 0 = \dot c + \operatorname{div}\big(\mathbb{M}(c)\Xi\big) - \mathbb{H}(c)\xi. \]

From the first two equations we conclude $\Xi = \tfrac12\nabla\lambda = \nabla\xi$, which is the desired result. ■


2.3 Scalar reaction-diffusion equations 

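On $\Omega \subset \mathbb{R}^d$, which is a bounded and convex domain, we consider scalar equations of the
form

\[ \dot u = \operatorname{div}\big(a(u)\nabla u\big) - f(u) \ \text{ in } \Omega, \qquad \nabla u\cdot\nu = 0 \ \text{ on } \partial\Omega, \]

where we assume that $f$ changes sign, such that $f(u)(u-1)$ is positive for $u \in\, ]0,1[\,\cup\,]1,\infty[$.
We want to write the above equation as a gradient system $(X, \mathcal{F}, \mathbb{K})$ with $X = L^1(\Omega)$,

\[ \mathcal{F}(u) = \int_\Omega F(u(x))\,dx, \quad\text{and}\quad \mathbb{K}(u)\xi = -\operatorname{div}\big(\mu(u)\nabla\xi\big) + k(u)\xi \ \text{ with } \nabla\xi\cdot\nu = 0, \]

where $F : \mathbb{R} \to [0,\infty]$ is a strictly convex function with $F(u) = \infty$ for $u < 0$ and
$F(1) = 0$. Moreover, we assume $\mu(u), k(u) \ge 0$, such that the dual dissipation potential
is the nonnegative quadratic form

\[ \Psi^*(u,\xi) = \tfrac12\langle\xi, \mathbb{K}(u)\xi\rangle = \frac12\int_\Omega \mu(u)|\nabla\xi|^2 + k(u)\xi^2\,dx. \]

Using $D\mathcal{F}(u) = F'(u(x))$ we obtain

\[ \mathbb{K}(u)\,D\mathcal{F}(u) = -\operatorname{div}\big(\mu(u)F''(u)\nabla u\big) + k(u)F'(u). \]

Hence we see that we obtain the above reaction-diffusion equation, if we choose $F$, $\mu$, and
$k$ such that the following relations hold:

\[ a(u) = \mu(u)F''(u) \quad\text{and}\quad k(u)F'(u) = f(u). \]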


There are several canonical choices. Quite often one is interested in the case $a \equiv 1$,
which gives rise to the simple semilinear equation $\dot u = \Delta u - f(u)$. To realize this one
chooses $\mu(u) = 1/F''(u)$. This is particularly interesting in the case of the logarithmic
entropy where $F(u) = F_B(u) := u\log u - u + 1$. Then, $\mu(u) = 1/F''(u) = u$ and we obtain
the Wasserstein operator $\xi \mapsto -\operatorname{div}(u\nabla\xi)$ for the diffusion part.

For the reaction part one simply chooses $k(u) = f(u)/F'(u)$, which is positive, since
$f(u)$ and $F'(u)$ both change sign at $u = 1$. For $F = F_B$ and the equation

\[ \dot u = \Delta u - \kappa\,(u^\beta - u^\alpha) \quad\text{with } 0 < \alpha < \beta \]

we obtain $k(u) = \kappa(u^\beta - u^\alpha)/\log u = \kappa(\beta-\alpha)\Lambda(u^\alpha,u^\beta)$, where the logarithmic mean is
given via $\Lambda(a,b) = (a-b)/(\log a - \log b)$. This equation models the evolution of a single
diffusing species undergoing the creation-annihilation reaction

\[ \alpha X \;\overset{\kappa}{\rightleftharpoons}\; \beta X. \]
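The identity between the two expressions for $k(u)$ can be checked numerically. The short Python sketch below compares $\kappa(u^\beta - u^\alpha)/\log u$ with $\kappa(\beta-\alpha)\Lambda(u^\alpha,u^\beta)$; the function names and the sample values of $\kappa$, $\alpha$, $\beta$ are assumptions made only for this illustration.

```python
import math

def log_mean(a, b):
    """Logarithmic mean Lambda(a, b) = (a - b) / (log a - log b), with Lambda(a, a) = a."""
    if a == b:
        return a
    return (a - b) / (math.log(a) - math.log(b))

def k_direct(u, kappa, alpha, beta):
    """Reaction coefficient k(u) = f(u) / F_B'(u) with f(u) = kappa (u^beta - u^alpha), u != 1."""
    return kappa * (u ** beta - u ** alpha) / math.log(u)

def k_via_log_mean(u, kappa, alpha, beta):
    """The same coefficient written with the logarithmic mean."""
    return kappa * (beta - alpha) * log_mean(u ** alpha, u ** beta)

for u in (0.2, 0.9, 1.5, 4.0):
    print(u, k_direct(u, 0.7, 1.0, 3.0), k_via_log_mean(u, 0.7, 1.0, 3.0))
```

The second form has the advantage that it extends continuously to $u = 1$, where the quotient in the first form is of type $0/0$.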


The simplest example of a reaction-diffusion distance is the Hellinger-Kantorovich
distance studied in Section 3 in great detail. It is defined via the scalar Onsager
operator

\[ \mathbb{K}_{\alpha,\beta}(c)\xi := -\alpha\operatorname{div}(c\nabla\xi) + \beta c\,\xi, \tag{2.7} \]

where $\alpha, \beta$ are nonnegative parameters. The special property of this operator is that it is
linear in the variable $c$. This will allow us to do explicit calculations for the corresponding
dissipation distance $D_{\alpha,\beta} := D_{\mathbb{K}_{\alpha,\beta}}$. In particular, the associated distance is defined for
all pairs of (nonnegative and finite) measures $\mu_0, \mu_1 \in \mathcal{M}(\Omega)$, not just for probability
measures $\mathcal{P}(\Omega)$. In fact, we will see that for $\beta > 0$ the geodesic curves connecting two
different probability measures will have mass less than one for all arclength parameters
$s \in\, ]0,1[$.

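For $\beta = 0$ we obtain the scaled Kantorovich-Wasserstein distance, namely

\[ D_{\alpha,0}(\mu_0,\mu_1) = \begin{cases} \sqrt{\dfrac{|\mu_0|}{\alpha}}\; W\Big(\dfrac{\mu_0}{|\mu_0|}, \dfrac{\mu_1}{|\mu_1|}\Big) & \text{if } |\mu_0| = |\mu_1|,\\[1ex] \infty & \text{else.} \end{cases} \]

Here, $|\mu_j| = \mu_j(\Omega)$ is the total mass of the measure. The geodesic curves are given in
terms of the classical optimal transport, see [AGS05, Ch. 7].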

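For $\alpha = 0$ we obtain a scaled version of the Hellinger distance (sometimes also called
Hellinger-Kakutani distance), namely

\[ D_{0,\beta}(\mu_0,\mu_1)^2 = \frac{4}{\beta}\,H(\mu_0,\mu_1)^2 \quad\text{with}\quad H(\mu_0,\mu_1)^2 = \int_\Omega \Big(\sqrt{\frac{d\mu_0}{d\mu_*}} - \sqrt{\frac{d\mu_1}{d\mu_*}}\Big)^2 d\mu_* \]

for a reference measure $\mu_*$ with $\mu_i \ll \mu_*$ (e.g. $\mu_* = \mu_0 + \mu_1$), see [Sch96, Theorem 4]. The
geodesic curves are given by linear interpolation of the square roots of the densities, i.e.

\[ \mu^H(s) = \Big((1-s)\sqrt{\frac{d\mu_0}{d\mu_*}} + s\sqrt{\frac{d\mu_1}{d\mu_*}}\Big)^2 \mu_*. \tag{2.8} \]

By using the estimate $\mu^H(s) \ge (1-s)^2\mu_0 + s^2\mu_1$ and minimizing the right-hand side over
$s \in [0,1]$, we obtain the lower estimate $|\mu^H(s)| \ge |\mu_0||\mu_1|/(|\mu_0| + |\mu_1|)$, i.e. the total mass of
the geodesic $\mu^H(s)$ is bounded from below by half of the harmonic mean of the total masses
of $\mu_0$ and $\mu_1$. Moreover, an elementary calculation gives the identity

\[ |\mu^H(s)| = (1-s)|\mu_0| + s|\mu_1| - s(1-s)\,H(\mu_0,\mu_1)^2. \tag{2.9} \]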
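The mass identity (2.9) and the lower mass bound are easy to test on a discrete reference measure. The following Python sketch does this for densities given as lists with respect to weights `w`; the discretization, the sample data, and the function names are assumptions made for this illustration only.

```python
import math

def hellinger_sq(f0, f1, w):
    """H(mu0, mu1)^2 for densities f0, f1 w.r.t. a discrete reference measure with weights w."""
    return sum(wi * (math.sqrt(p) - math.sqrt(q)) ** 2 for p, q, wi in zip(f0, f1, w))

def hellinger_geodesic(f0, f1, s):
    """Density of mu^H(s): linear interpolation of the square roots of the densities."""
    return [((1 - s) * math.sqrt(p) + s * math.sqrt(q)) ** 2 for p, q in zip(f0, f1)]

def total_mass(f, w):
    """Total mass of the measure with density f w.r.t. the weights w."""
    return sum(wi * fi for fi, wi in zip(f, w))

f0 = [0.0, 1.0, 2.0, 0.5]
f1 = [3.0, 0.0, 1.0, 0.5]
w = [0.25, 0.25, 0.25, 0.25]
for s in (0.0, 0.3, 0.5, 1.0):
    lhs = total_mass(hellinger_geodesic(f0, f1, s), w)
    rhs = ((1 - s) * total_mass(f0, w) + s * total_mass(f1, w)
           - s * (1 - s) * hellinger_sq(f0, f1, w))
    print(f"s={s:.1f}  mass={lhs:.6f}  right-hand side of (2.9)={rhs:.6f}")
```

The identity holds pointwise in the density before integration, which is why it carries over to any reference measure.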


3 The Hellinger-Kantorovich distance

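In this section we discuss the dissipation distance $D_{\alpha,\beta}(\mu_0,\mu_1)$ that is induced by the
Onsager operator $\mathbb{K}_{\alpha,\beta}(c)\xi = -\operatorname{div}(\alpha c\nabla\xi) + \beta c\xi$, given for $\mu_0,\mu_1 \in \mathcal{M}(\Omega)$ as in (2.3).
Using Proposition 2.1 we can rewrite this formulation in an equivalent form as

\[ D_{\alpha,\beta}(\mu_0,\mu_1)^2 = \inf\Big\{ \int_0^1\!\!\int_\Omega \big[\alpha|\Xi|^2 + \beta\xi^2\big]\,d\mu(s)\,ds \;\Big|\; \tfrac{d}{ds}\mu + \alpha\operatorname{div}(\mu\Xi) = \beta\mu\xi,\ \mu_0 \overset{\mu}{\rightsquigarrow} \mu_1 \Big\} \tag{3.1} \]

with $\Xi : [0,1]\times\Omega \to \mathbb{R}^d$ denoting the vector field.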

In most of this section we will restrict ourselves without loss of generality to the case
$\alpha = 1$ and $\beta = 4$ for simplicity. Occasionally, we will give some of the formulas for
general $\alpha$ and $\beta$ to highlight the dependence on these parameters. Note that we can
always use the simple scaling $\mathbb{K}_{\alpha,\beta} = \beta\,\mathbb{K}_{\alpha/\beta,1}$ giving the general relation $D_{\alpha,\beta}(\mu_0,\mu_1) =
D_{\alpha/\beta,1}(\mu_0,\mu_1)/\sqrt{\beta}$. Moreover, the factor $\sqrt{\alpha/\beta}$ can be transformed away by rescaling $\Omega$,
i.e. $x \mapsto \sqrt{\alpha/\beta}\,x$.


Note that for sufficiently regular $\mu$, $\xi$, and $\Xi$ in (3.1) we obtain by Proposition 2.1
$\Xi = \nabla\xi$, and a formal calculation leads to the following system of equations for geodesic
curves:

\[ \dot\mu = -\alpha\operatorname{div}(\mu\nabla\xi) + \beta\mu\xi, \qquad \dot\xi + \frac{\alpha}{2}|\nabla\xi|^2 + \frac{\beta}{2}\xi^2 = 0. \tag{3.2} \]

For the case $\beta = 0$ and $\alpha = 1$ this corresponds to [BeB00, Eqn. (37)]. A full justification
of this coupled system is given in [LMS15, Sect. 8.6].

3.1 The optimal curves for $D_{\alpha,\beta}$ with one or two mass-points

The striking feature of optimal transport is that for affine mobilities point masses (Dirac
measures) are transported as point masses, i.e. the geodesic curve connecting $\mu_0 = \delta_{x_0}$
and $\mu_1 = \delta_{x_1}$ is given by $\mu_s = \delta_{x(s)}$, where $[0,1] \ni s \mapsto x(s)$ is a geodesic curve in the
underlying domain $\Omega$.

Since the Onsager operator $\mathbb{K}_{\alpha,\beta}(c)$ in (2.7) depends only linearly on the state $c$, we
expect a similar behavior. In particular, note that the definition of the distance in (3.1)
is well-defined for general curves of measures $\mu(s) \in \mathcal{M}(\Omega)$ if we understand the linear
constraint $\tfrac{d}{ds}\mu + \alpha\operatorname{div}(\mu\Xi) = \beta\mu\xi$ in the distributional sense.

As a first step, it is instructive to study the $\mathbb{K}_{\alpha,\beta}$-length of curves given by a moving
point mass in the form

\[ \gamma_{x,a} : [0,1] \ni s \mapsto \mu(s) = a(s)\delta_{x(s)} \quad\text{with } x(s) \in \Omega \text{ and } a(s) \ge 0. \]

Minimizing the action functional in (3.1) only over curves of this form for given end points
$a_i\delta_{x_i}$, $i = 0,1$, always gives an upper bound for the distance $D_{\alpha,\beta}(a_0\delta_{x_0}, a_1\delta_{x_1})$. Indeed,
we show that up to a certain threshold for the Euclidean distance $|x_0-x_1|$ it will even be
the exact distance and the minimizing $\gamma_{x,a}$ is a geodesic curve.

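The main point is that we are able to calculate the $s$-derivative of $\mu(s) = \gamma_{x,a}(s)$
and compare it to the continuity equation. Multiplying the continuity equation with test
functions we obtain after integration by parts

\[ \tfrac{d}{ds}\mu(s) = -\operatorname{div}\big(\dot x(s)a(s)\delta_{x(s)}\big) + \dot a(s)\delta_{x(s)} = -\operatorname{div}\big(\dot x(s)\mu(s)\big) + \frac{\dot a(s)}{a(s)}\,\mu(s). \]

Thus, comparing with the continuity equation in the definition of $D_{\alpha,\beta}$ we find the relations

\[ \Xi(s,x(s)) = \frac{1}{\alpha}\,\dot x(s) \quad\text{and}\quad \xi(s,x(s)) = \frac{\dot a(s)}{\beta a(s)}. \tag{3.3} \]

We may realize the constraint $\Xi = \nabla\xi$ via $\xi(s,y) = \xi(s,x(s)) + \tfrac{1}{\alpha}\dot x(s)\cdot(y-x(s))$.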

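Having identified the vector and scalar fields $\Xi$ and $\xi$, respectively, we obtain the
length of the curve $s \mapsto a(s)\delta_{x(s)}$ via

\[ \mathrm{Length}_{\alpha,\beta}(\gamma_{x,a})^2 = \int_0^1 \Big[\frac{1}{\alpha}|\dot x(s)|^2 + \frac{1}{\beta}\Big(\frac{\dot a(s)}{a(s)}\Big)^2\Big]\,a(s)\,ds. \tag{3.4} \]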


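For $\alpha = 0$ and $\beta = 8$ this corresponds to the representation in [Sch96, Thm. 4] for the
Hellinger-Kakutani distance. Minimizing this expression for given endpoints of $\gamma_{x,a}$ we
find that $x(s)$ travels along a straight line, which reflects the fact that our choice of metric
in $\Omega$ is the Euclidean one. However, the speed will not be constant. Hence, we introduce
functions

\[ \rho \in \mathfrak{R}(0,1) := \{\, \rho \in H^1(0,1) \mid \rho(0) = 0,\ \rho(1) = 1,\ \dot\rho \ge 0 \,\}, \qquad a \in \mathfrak{A}(a_0,a_1) := \{\, a \in H^1(0,1) \mid a \ge 0,\ a(0) = a_0,\ a(1) = a_1 \,\} \]

such that we can write $x(s) = (1-\rho(s))x_0 + \rho(s)x_1$ and have $|\dot x(s)| = \dot\rho(s)L$ with $L =
|x_1-x_0|$. For the one-mass-point problem we define the function $\mathfrak{J}^{\mathrm{1mp}}_{\alpha,\beta} : [0,\infty[\,\times\,[0,\infty[^2 \to
[0,\infty[$ via

\[ \mathfrak{J}^{\mathrm{1mp}}_{\alpha,\beta}(L^2,a_0,a_1) := \inf\Big\{ \int_0^1 \Big[\frac{L^2}{\alpha}\dot\rho(s)^2a(s) + \frac{\dot a(s)^2}{\beta a(s)}\Big]\,ds \;\Big|\; \rho \in \mathfrak{R}(0,1),\ a \in \mathfrak{A}(a_0,a_1) \Big\}. \tag{3.6} \]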


The functional $\mathfrak{J}^{\mathrm{1mp}}_{\alpha,\beta}$ satisfies a scaling identity with respect to the parameters $\alpha, \beta > 0$:
For $\theta > 0$ we have

\[ \mathfrak{J}^{\mathrm{1mp}}_{\alpha,\beta}(L^2,a_0,a_1) = \frac{1}{\theta}\,\mathfrak{J}^{\mathrm{1mp}}_{1,\beta/\theta}\Big(\frac{\theta}{\alpha}L^2,a_0,a_1\Big). \tag{3.7} \]

Hence, we can restrict ourselves to one particular choice of $\beta/\theta > 0$ such that the general
case can be recovered from a rescaling of the Euclidean distance in $\Omega$. In particular, it
will prove convenient to choose $\theta = \beta/4$ such that we will consider $\mathfrak{J}^{\mathrm{1mp}}_{1,4}$ first, which is
also the scaling used in [LMS15].


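Theorem 3.1 We have

\[ \mathfrak{J}^{\mathrm{1mp}}_{1,4}(L^2,a_0,a_1) = a_0 + a_1 - 2\sqrt{a_0a_1}\,\cos_\pi(L) \quad\text{with}\quad \cos_\pi(L) := \begin{cases} \cos(L) & \text{for } L < \pi,\\ -1 & \text{for } L \ge \pi. \end{cases} \tag{3.8} \]

The infimum is a minimum for $L < \pi$, and it is attained for

\[ \begin{aligned} a(s) &= (1-s)^2a_0 + s^2a_1 + 2s(1-s)\sqrt{a_0a_1}\,\cos(L),\\ \rho(s) &= \begin{cases} \dfrac{1}{L}\arctan\Big(\dfrac{s\sqrt{a_1}\sin(L)}{(1-s)\sqrt{a_0} + s\sqrt{a_1}\cos(L)}\Big) & \text{if } (1-s)\sqrt{a_0} + s\cos(L)\sqrt{a_1} > 0,\\[1ex] \dfrac{\pi}{2L} & \text{if } (1-s)\sqrt{a_0} + s\cos(L)\sqrt{a_1} = 0,\\[1ex] \dfrac{1}{L}\arctan\Big(\dfrac{s\sqrt{a_1}\sin(L)}{(1-s)\sqrt{a_0} + s\sqrt{a_1}\cos(L)}\Big) + \dfrac{\pi}{L} & \text{otherwise.} \end{cases} \end{aligned} \tag{3.9} \]

For $L \ge \pi$ the minimizing sequences converge to $a(s) = c(s-\theta)^2$ and to a step function $\rho$
jumping from 0 to 1 at $s = \theta$, for a certain $c > 0$ and $\theta \in [0,1]$.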

Proof: To study the infimum of $\mathfrak{J}^{\mathrm{1mp}}_{1,4}$ we transform the system by using $b(s) = \sqrt{a(s)}$.
Keeping $L > 0$ fixed we obtain the functional

\[ \mathcal{X}_L(b,\rho) = \int_0^1 L^2b(s)^2\dot\rho(s)^2 + \dot b(s)^2\,ds. \]

Clearly, the infimum of $\mathcal{X}_L$ gives the infimum in the definition of $\mathfrak{J}^{\mathrm{1mp}}_{1,4}$. We now consider
a minimizing sequence $(b_n,\rho_n)$ and observe that $(b_n)$ must remain bounded in $H^1(0,1)$.
Hence, after choosing a suitable subsequence (not relabeled), we may assume $b_n \rightharpoonup b$ in
$H^1(0,1)$. We distinguish between the following three cases.

Case 1. $\underline{b} := \min\{\, b(s) \mid s \in [0,1] \,\} > 0$: In this case we may further conclude that $\rho_n$
is also bounded in $H^1(0,1)$. Hence, we can also assume $\rho_n \rightharpoonup \rho$ in $H^1(0,1)$. Because of the
lower semicontinuity of $\mathcal{X}_L$ on $H^1(0,1)^2$ we conclude that $(b,\rho)$ is the global minimizer
of $\mathcal{X}_L$. This implies that $(a,\rho) = (b^2,\rho)$ is the global minimizer of $\mathfrak{J}^{\mathrm{1mp}}_{1,4}$, which certainly
satisfies the Euler-Lagrange equations

\[ \frac{d}{ds}\big(a\dot\rho\big) = 0, \qquad 4(L\dot\rho)^2 - (\dot a/a)^2 - 2\frac{d}{ds}(\dot a/a) = 0. \]

From the first equation and $\int_0^1\dot\rho\,ds = 1$ we obtain

\[ \dot\rho(s) = H[a]/a(s), \quad\text{where}\quad H[a] = \Big(\int_0^1 1/a(s)\,ds\Big)^{-1} \]

denotes the harmonic mean, which satisfies $H[a] \ge \underline{b}^2 > 0$. By inserting this into the
second equation we see that all solutions are given in the form

\[ a(s) = c_0 + c_1(s-\theta)^2 \quad\text{with}\quad c_0c_1 = L^2H[a]^2. \]

Hence, together with the boundary conditions $a_0 = a(0) = c_0 + c_1\theta^2$ and $a_1 = a(1) =
c_0 + c_1(1-\theta)^2$ we have three nonlinear equations for the unknowns $c_0$, $c_1$, and $\theta$, which can
be solved easily for $L < \pi$ giving a unique solution. For $L \ge \pi$ no solution with positive
$b$ exists. In particular, since the primitive of the inverse of a strictly positive quadratic
function is given in terms of the arctan function we obtain the formulas in (3.9), using
also the addition theorem for arctan.

Thus, in the case $\underline{b} > 0$ the global minimizer is the unique, positive critical point $(a,\rho)$.

Case 2. $\underline{b} = 0$: Since $b$ is continuous, there exists $s_* \in [0,1]$ with $b(s_*) = 0$. Neglecting
the term $L^2b^2\dot\rho^2$ in the integrand in $\mathcal{X}_L$ we can minimize the remaining quadratic term
$\int_0^1\dot b^2\,ds$ subject to the boundary conditions $b(0) = \sqrt{a_0}$, $b(1) = \sqrt{a_1}$, and $b(s_*) = 0$. This
leads to a minimizer that is piecewise affine and gives the lower bound

\[ \mathcal{X}_L(b,\rho) \ge \frac{a_0}{s_*} + \frac{a_1}{1-s_*} \ge \big(\sqrt{a_0} + \sqrt{a_1}\big)^2, \tag{3.10} \]

where the last estimate follows from minimization in $s_*$.

It is now easy to see that the value $(\sqrt{a_0} + \sqrt{a_1})^2$ is indeed the infimum, since it can
be obtained as a limit of a minimizing sequence. For this take piecewise affine functions
$(b_n,\rho_n)$ satisfying $(b_n(s),\dot\rho_n(s)) = (0,n)$ for $s \in [s_n, s_n+1/n]$ with $s_n \to s_0$, where $s_0$ is
the optimal $s_*$ in (3.10). On $[0,s_n]$ we take $(\dot b_n(s),\dot\rho_n(s)) = (-\sqrt{a_0}/s_n, 0)$ and similarly on
$[s_n+1/n, 1]$.

General case: Since the infimum obtained in case 2 is strictly larger than that in
case 1 (because of $L < \pi$), we see that the two cases exclude each other. If $L < \pi$ then
case 1 occurs while for $L \ge \pi$ case 2 sets in. Hence, the theorem is established. ■
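As a plausibility check of Theorem 3.1, the action of the explicit pair $(a,\rho)$ from (3.9) can be integrated numerically and compared with the value in (3.8). The Python sketch below does this with a simple finite-difference quadrature. It uses `atan2` to cover the three branches of (3.9) at once, which is an implementation choice of this illustration, and all function names are hypothetical.

```python
import math

def profile(s, a0, a1, L):
    """Return (a(s), rho(s)) as in (3.9) for 0 < L < pi; atan2 selects the correct arctan branch."""
    re = (1 - s) * math.sqrt(a0) + s * math.sqrt(a1) * math.cos(L)
    im = s * math.sqrt(a1) * math.sin(L)
    return re * re + im * im, math.atan2(im, re) / L

def action(a0, a1, L, n=2000):
    """Approximate the integral of L^2 rho'^2 a + a'^2 / (4 a) over [0, 1] on n subintervals."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a_left, rho_left = profile(i * h, a0, a1, L)
        a_right, rho_right = profile((i + 1) * h, a0, a1, L)
        a_mid = 0.5 * (a_left + a_right)
        da = (a_right - a_left) / h
        drho = (rho_right - rho_left) / h
        total += (L * L * drho * drho * a_mid + da * da / (4.0 * a_mid)) * h
    return total

def value_3_8(a0, a1, L):
    """Closed-form value a0 + a1 - 2 sqrt(a0 a1) cos_pi(L) from (3.8)."""
    return a0 + a1 - 2.0 * math.sqrt(a0 * a1) * math.cos(min(L, math.pi))

print(action(1.0, 2.0, 2.0), value_3_8(1.0, 2.0, 2.0))
```

The two printed numbers should agree up to the discretization error. Geometrically, $(\sqrt{a(s)}, L\rho(s))$ are the polar coordinates of the straight segment from $\sqrt{a_0}$ to $\sqrt{a_1}e^{iL}$ in the complex plane, which is why the action equals the squared Euclidean length of that segment.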


Although the stationary states in Theorem 3.1 may be the global minimizers, they
are not always the geodesic curves with respect to $D_{1,4}$. To see this we consider the pure
reaction case and define the curve

\[ \gamma_{a_0,a_1}(s) = a_0(s)\delta_{x_0} + a_1(s)\delta_{x_1}, \]

also connecting the measures $\mu_j = a_j\delta_{x_j}$ if $a_j(j) = a_j$ and $a_j(1-j) = 0$ for $j = 0,1$. If
$a_0(s), a_1(s) > 0$, this curve consists of two separated mass points that do not move. As in
the previous case of the moving mass point we can compute the solutions of the continuity
equation to obtain

\[ \Xi(s,x_j) = 0 \quad\text{and}\quad \xi(s,x_j) = \frac{\dot a_j(s)}{4a_j(s)}. \]

The squared length of these curves is given by $\tfrac14\int_0^1\sum_{j=0}^{1}\dot a_j^2/a_j\,ds$ and the optimal choice
for $a_j$ is $a_0(s) = a_0(1-s)^2$ and $a_1(s) = a_1s^2$, giving the minimal squared length

\[ \mathrm{Length}_{1,4}(\gamma_{\mathrm{opt}})^2 = a_0 + a_1 \ge D_{1,4}(a_0\delta_{x_0}, a_1\delta_{x_1})^2. \]

We see that this result is less than $\mathfrak{J}^{\mathrm{1mp}}_{1,4}(L^2,a_0,a_1)$ for $\pi/2 < |x_1-x_0| < \pi$. In fact,
we will show later that the last estimate is sharp if and only if $|x_1-x_0| \ge \pi/2$.
Figure 1: Top: The curves $s \mapsto \big(L\int_0^s\dot\rho(\tau)\,d\tau,\ a(s)\big)$ for different values of $0 < L < \pi$. Solid curves
are true geodesics, while dashed curves are shortest "one-mass-point paths" but not geodesic
curves. Bottom: Curves for $L = \pi/2$ and different mass ratios $a_0/a_1$.


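To highlight the dependencies on $\alpha$ and $\beta$ in $\mathfrak{J}^{\mathrm{1mp}}_{\alpha,\beta}$ we can use the scaling (3.7) for
all $\alpha,\beta > 0$. To include the limit cases of the Hellinger distance (i.e. $\alpha = 0$) and the
Kantorovich-Wasserstein distance (i.e. $\beta = 0$) we define the functions $\mathfrak{G}_{\alpha,\beta}$ via

\[ \mathfrak{G}_{\alpha,\beta}(L^2,b_0,b_1) := \begin{cases} \dfrac{4}{\beta}\Big(b_0^2 + b_1^2 - 2b_0b_1\cos_\pi\big(\sqrt{\beta/(4\alpha)}\,L\big)\Big) & \text{for } \alpha,\beta > 0,\\[1ex] \dfrac{4}{\beta}(b_0-b_1)^2 & \text{for } \alpha = L = 0 \text{ and } \beta > 0,\\[1ex] \dfrac{L^2b_0^2}{\alpha} & \text{for } \beta = 0,\ \alpha > 0, \text{ and } b_1 = b_0,\\[1ex] 0 & \text{for } \alpha = \beta = L = 0 \text{ and } b_0 = b_1,\\[1ex] \infty & \text{otherwise.} \end{cases} \tag{3.11} \]

We emphasize that $\mathfrak{G}_{0,\beta}$ and $\mathfrak{G}_{\alpha,0}$ can be obtained as $\Gamma$-limits of $\mathfrak{G}_{\alpha_n,\beta_n}$ for $\alpha_n \searrow 0$ or
$\beta_n \searrow 0$, respectively.

Using $\mathfrak{G}_{\alpha,\beta}$ we can express $\mathfrak{J}^{\mathrm{1mp}}_{\alpha,\beta}$ for all $\alpha,\beta \ge 0$, where the cases $\alpha = 0$ or $\beta = 0$ mean
that $\dot\rho \equiv 0$ or $\dot a \equiv 0$, respectively. Moreover $\mathfrak{J}^{\mathrm{1mp}}_{\alpha,\beta} = +\infty$, if the set of competitors $(a,\rho)$
providing finite values is empty.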

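Corollary 3.2 For all $\alpha, \beta \ge 0$ we have

\[ \mathfrak{J}^{\mathrm{1mp}}_{\alpha,\beta}(L^2,a_0,a_1) = \mathfrak{G}_{\alpha,\beta}(L^2,\sqrt{a_0},\sqrt{a_1}). \]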
Figure 2: The function $s \mapsto \rho(s)$ in Theorem 3.1 for different ratios $a_0/a_1$ and $L = \pi/2$ (left)
and $L = \pi/1.1$ (right). The dashed curve corresponds to $a_0/a_1 = 1$, while curves above and
below satisfy $a_0/a_1 < 1$ and $a_0/a_1 > 1$, respectively.


Proof: We only need to consider the boundary cases.

For $\alpha = \beta = 0$ we have $\dot a \equiv 0 \equiv \dot\rho$, which implies that $\mathfrak{J}^{\mathrm{1mp}}_{0,0}$ is finite only for $L = 0$ and
$a_0 = a_1$.

For $\alpha = 0$ and $\beta > 0$ we have $\dot\rho \equiv 0$ and obtain a finite value only for $L = 0$. Clearly,
the infimum of $\int_0^1\dot a^2/(\beta a)\,ds$ is given by $4\big(\sqrt{a(1)} - \sqrt{a(0)}\big)^2/\beta$, which is the desired result.

The case $\beta = 0$ and $\alpha > 0$ provides $\dot a \equiv 0$ and $\dot\rho \equiv 1$. Hence, the infimum is $L^2a_0/\alpha$
for $a_1 = a_0$ and $\infty$ otherwise. ■


Example 3.3 (Mass splitting) At the end of this subsection we give a more complicated example for an optimal curve consisting of two point masses. We want to connect the measures $\mu_0 = a_0\delta_{x_0}$ and $\mu_1 = a_1\delta_{x_0} + b_1\delta_{x_1}$, where $L = |x_0 - x_1| < \pi/2$, i.e. $\cos_\pi(L) = \cos(L) > 0$. So the question is how much of the mass at $x_0$ is kept there, how much of the mass is used for transport, and how much mass is created at $x_1$. We consider the curve

\[ \gamma(s) = a(s)\delta_{x_0} + c(s)\delta_{x(s)} + b(s)\delta_{x_1} \]

with $a(s), b(s), c(s) \ge 0$ and the boundary conditions

\[ x(0) = x_0,\quad x(1) = x_1,\quad a(0) + c(0) = a_0,\quad a(1) = a_1,\quad b(0) = 0,\quad c(1) + b(1) = b_1. \]

Choosing $\alpha = 1$ and $\beta = 4$ and optimizing each of the given three curves under their own boundary conditions gives

\[ \mathrm{Length}_{1,4}(\gamma)^2 = \big(\sqrt{a(0)} - \sqrt{a(1)}\big)^2 + c(0) + c(1) - 2\sqrt{c(0)c(1)}\cos_\pi(L) + b(1). \]

From the constraint $c(1) + b(1) = b_1$ and the second-to-last term, we see that it is optimal to choose $c(1)$ as large as possible, namely $c(1) = b_1$ and $b(1) = 0$. In particular, we have no creation at $x_1$, i.e. $b \equiv 0$. Setting $c_0 = c(0)$ and eliminating $a(0) = a_0 - c_0$ we find

\[ \mathrm{Length}_{1,4}(\gamma)^2 = a_0 - 2\sqrt{a_1}\sqrt{a_0 - c_0} - 2\cos_\pi(L)\sqrt{b_1 c_0} + b_1 + a_1. \]

The minimal value is achieved for the choice $c_0 = a_0 b_1\cos_\pi(L)^2/(a_1 + b_1\cos_\pi(L)^2)$, which means a mass splitting as $0 < c_0 < a_0$. Hence, we have established the estimate

\[ D_{1,4}(\mu_0, \mu_1)^2 = D_{1,4}(a_0\delta_{x_0},\, a_1\delta_{x_0} + b_1\delta_{x_1})^2 \le a_0 + a_1 + b_1 - 2\sqrt{a_0\big(a_1 + b_1\cos_\pi(|x_1 - x_0|)^2\big)}. \]

In fact, it will be shown in an example in Section 5 that the curve $\gamma$ is indeed a geodesic curve, i.e. "$\le$" can be replaced by "$=$".
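The optimization over $c_0$ in Example 3.3 can be checked numerically. The sketch below evaluates the squared length as a function of $c_0$, compares the closed-form minimizer with a brute-force grid search, and uses hypothetical helper names and sample values chosen only for illustration.

```python
import math


def split_length_sq(c0, a0, a1, b1, L):
    """Squared length from Example 3.3 after eliminating a(0) = a0 - c0."""
    return (a0 + a1 + b1
            - 2 * math.sqrt(a1 * (a0 - c0))
            - 2 * math.cos(L) * math.sqrt(b1 * c0))


def optimal_split(a0, a1, b1, L):
    """Closed-form minimizer c0 and minimal value as stated in the example."""
    q = b1 * math.cos(L) ** 2
    c0 = a0 * q / (a1 + q)
    value = a0 + a1 + b1 - 2 * math.sqrt(a0 * (a1 + q))
    return c0, value


def grid_minimum(a0, a1, b1, L, n=20_000):
    """Brute-force minimum over c0 in [0, a0]; an illustrative check only."""
    return min(split_length_sq(a0 * k / n, a0, a1, b1, L) for k in range(n + 1))


if __name__ == "__main__":
    c0, value = optimal_split(2.0, 1.0, 3.0, 0.7)
    print(c0, value, grid_minimum(2.0, 1.0, 3.0, 0.7))
```

The check assumes $L < \pi/2$, so that $\cos_\pi(L) = \cos(L) > 0$ as in the example.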


3.2 Optimal transport on the cone 

The crucial point in the characterization of the distance $D_{1,4}$ induced by the Onsager operator $\mathbb{K}_{1,4}$ in (3.1), for $\alpha = 1$ and $\beta = 4$, is that the functional in (3.6), which gives the cost for optimally transporting a single mass point, is closely related to the metric construction of a cone over the metric space $(\Omega, |\cdot|)$. We will briefly explain the construction in this section and refer to [BBI01, Sect. 3.6.2] for more details.

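Given the closed and convex domain $\Omega \subset \mathbb{R}^d$ we construct the cone $\mathfrak{C}_\Omega$ as the quotient of $\Omega \times [0,\infty[$ over $\Omega \times \{0\}$, i.e.,

\[ \mathfrak{C}_\Omega := (\Omega \times [0,\infty[)\,/\,(\Omega \times \{0\}). \]

In particular, all points in $\Omega \times \{0\}$ are identified with one point, namely the tip of the cone denoted by $\mathfrak{o}$. For any $x \in \Omega$ and $r > 0$ the equivalence classes are denoted by $z = [x, r] \in \mathfrak{C}_\Omega$, while for $r = 0$ the equivalence class $[x, 0]$ is equal to $\mathfrak{o}$.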

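Motivated by the previous section we define the distance $d_{\mathfrak{C}} : \mathfrak{C}_\Omega \times \mathfrak{C}_\Omega \to [0,\infty[$ on the cone space $\mathfrak{C}_\Omega$ as follows:

\[ d_{\mathfrak{C}}([x_0,r_0],[x_1,r_1])^2 := r_0^2 + r_1^2 - 2 r_0 r_1 \cos_\pi(|x_1 - x_0|), \]
where $\cos_\pi$ is defined as in Theorem 3.1.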
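For a one-dimensional base the cone distance can be compared directly with the Euclidean distance in a planar sector. The following sketch assumes that $\cos_\pi(L) = \cos(\min\{L, \pi\})$ (the definition from Theorem 3.1 is not repeated in this section) and uses illustrative function names.

```python
import math


def cos_pi(L):
    """Truncated cosine; assumed here to be cos(min(L, pi))."""
    return math.cos(min(L, math.pi))


def cone_dist(x0, r0, x1, r1):
    """Cone distance between [x0, r0] and [x1, r1] for a 1-D base space."""
    sq = r0 * r0 + r1 * r1 - 2 * r0 * r1 * cos_pi(abs(x1 - x0))
    return math.sqrt(max(sq, 0.0))


def to_sector(x, r):
    """Embed [x, r] into the plane as (r cos x, r sin x)."""
    return (r * math.cos(x), r * math.sin(x))


if __name__ == "__main__":
    # Angle difference below pi: the law of cosines gives the Euclidean distance.
    print(cone_dist(0.2, 1.5, 2.0, 0.7),
          math.dist(to_sector(0.2, 1.5), to_sector(2.0, 0.7)))
    # Angle difference above pi: the shortest path runs through the tip.
    print(cone_dist(0.0, 1.0, 3.5, 2.0))
```

In the second call the truncation makes the distance equal to $r_0 + r_1$, i.e. the length of the path through the tip.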

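For the special case that $\Omega = [0, \ell] \subset \mathbb{R}$ with $0 < \ell < 2\pi$, we can visualize $\mathfrak{C}_\Omega = \mathfrak{C}_{[0,\ell]}$ by the two-dimensional sector

\[ \{\, y = (r\cos x, r\sin x) \in \mathbb{R}^2 \mid r \ge 0,\ x \in [0,\ell] \,\}, \]

where $y = (0, 0)$ corresponds to the tip $\mathfrak{o}$. The induced distance is the Euclidean distance restricted to this sector, i.e. the geodesic curve between $y_0$ and $y_1$ is a straight segment if $|x_1 - x_0| \le \pi$, while it consists of the two rays connecting $y = 0$ with $y_0$ and $y_1$, respectively, if $\pi < |x_1 - x_0| \le \ell < 2\pi$, see Figure 3. In the case of the traveling mass point discussed in the previous section we identify the Dirac measures $a_i\delta_{x_i}$ with pairs $[x_i, \sqrt{a_i}] \in \mathfrak{C}_\Omega$. Thus, the result of Theorem 3.1 can be reformulated as follows: the optimal cost of moving the single mass point from $a_0\delta_{x_0}$ to $a_1\delta_{x_1}$ equals

\[ d_{\mathfrak{C}}([x_0, \sqrt{a_0}], [x_1, \sqrt{a_1}])^2. \]

For general coefficients $\alpha, \beta > 0$ the distance $d_{\mathfrak{C}}$ has to be replaced by
\[ d_{\alpha,\beta}(z_0, z_1) = d_{\alpha,\beta}([x_0,r_0],[x_1,r_1]) = \sqrt{\mathfrak{S}_{\alpha,\beta}(|x_1 - x_0|, r_0, r_1)}, \]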
Figure 3: The cone $\mathfrak{C}_{[0,3\pi/2]}$ represented as a sector in $\mathbb{R}^2$ via $(y_1, y_2) = (r\cos x, r\sin x)$ and three geodesic curves. The angle $x = \pi$ is critical for smoothness of geodesic curves.


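see (3.11). This distance can be seen as a geodesic distance on the cone $\mathfrak{C}_\Omega$ induced by a Riemannian metric (outside of the vertex $\mathfrak{o}$) given by the tensor

\[ \begin{pmatrix} \frac{r^2}{\alpha}\, I & 0 \\ 0 & \frac{4}{\beta} \end{pmatrix} \tag{3.12} \]

in the coordinates $(x, r)$. This fact is already seen in (3.4) if we set $a(s) = r(s)^2$.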


3.2.1 Geodesic curves in the cone space 

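As is shown in [BBI01, Sect. 3.6.2], the pair $(\mathfrak{C}_\Omega, d_{\mathfrak{C}})$ is a complete geodesic space, where each two points can be connected by a unique arclength-parameterized geodesic curve. These curves are given by the following geodesic interpolator $Z(s; \cdot, \cdot) : \mathfrak{C}_\Omega \times \mathfrak{C}_\Omega \to \mathfrak{C}_\Omega$ (recall $z_j = [x_j, r_j]$):

\[
\begin{aligned}
Z(s; z_0, z_1) &:= [X(s; z_0, z_1), R(s; z_0, z_1)], \quad\text{where} \\
R(s; z_0, z_1)^2 &:= (1-s)^2 r_0^2 + s^2 r_1^2 + 2s(1-s)\, r_0 r_1 \cos_\pi(|x_0 - x_1|), \\
X(s; z_0, z_1) &:= \big(1 - \rho(s; z_0, z_1)\big) x_0 + \rho(s; z_0, z_1)\, x_1, \\
\rho(s; z_0, z_1) &:=
\begin{cases}
\dfrac{1}{|x_1 - x_0|} \arccos\!\Big( \dfrac{(1-s) r_0 + s r_1 \cos|x_1 - x_0|}{R(s; z_0, z_1)} \Big) & \text{for } |x_1 - x_0| < \pi, \\[2ex]
\dfrac{1}{2}\Big(1 - \operatorname{sign}\big((1-s) r_0 - s r_1\big)\Big) & \text{for } |x_1 - x_0| \ge \pi.
\end{cases}
\end{aligned}
\tag{3.13}
\]

Note that in the definition of $R$ there is a "$+$" in front of the cosine term, while there is a "$-$" in the distance $d_{\mathfrak{C}}$. Moreover, by elementary geometric identities it is easy to see that the formula for $\rho$ is equivalent to the one obtained in Theorem 3.1.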


In particular, the curve defined by $\gamma(s) = Z(s; z_0, z_1)$ is a constant-speed geodesic curve with respect to the distance $d_{\mathfrak{C}}$ connecting $z_0$ and $z_1$, i.e.,

\[ \forall\, 0 \le s < t \le 1: \quad d_{\mathfrak{C}}(\gamma(s), \gamma(t)) = |t - s|\, d_{\mathfrak{C}}(z_0, z_1). \tag{3.14} \]
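The constant-speed property (3.14) can be tested numerically for a one-dimensional base. The sketch below implements the interpolator as reconstructed from (3.13), again assuming $\cos_\pi(L) = \cos(\min\{L,\pi\})$ and $x_0 \ne x_1$; the function names are illustrative.

```python
import math


def cone_dist(z0, z1):
    """Cone distance for points z = (x, r) over a 1-D base."""
    (x0, r0), (x1, r1) = z0, z1
    c = math.cos(min(abs(x1 - x0), math.pi))
    return math.sqrt(max(r0 * r0 + r1 * r1 - 2 * r0 * r1 * c, 0.0))


def interpolate(s, z0, z1):
    """Geodesic interpolator Z(s; z0, z1) for a 1-D base with x0 != x1."""
    (x0, r0), (x1, r1) = z0, z1
    L = abs(x1 - x0)
    c = math.cos(min(L, math.pi))
    R = math.sqrt(max((1 - s) ** 2 * r0 ** 2 + s ** 2 * r1 ** 2
                      + 2 * s * (1 - s) * r0 * r1 * c, 0.0))
    if R == 0.0:
        return (x0, 0.0)  # the tip; the base point is irrelevant there
    if L < math.pi:
        arg = ((1 - s) * r0 + s * r1 * math.cos(L)) / R
        rho = math.acos(max(-1.0, min(1.0, arg))) / L
    else:
        # Path through the tip: stay on the ray over x0 until the tip is reached.
        rho = 0.0 if (1 - s) * r0 - s * r1 > 0 else 1.0
    return ((1 - rho) * x0 + rho * x1, R)


if __name__ == "__main__":
    z0, z1 = (0.0, 1.0), (1.2, 2.0)
    d = cone_dist(z0, z1)
    for s, t in [(0.1, 0.5), (0.25, 0.9)]:
        print(cone_dist(interpolate(s, z0, z1), interpolate(t, z0, z1)), (t - s) * d)
```

Both printed columns should agree up to rounding, for angles below as well as above $\pi$.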


As for the Wasserstein-Kantorovich distance, the geodesic curves in $\mathfrak{C}_\Omega$ are key for the construction of the geodesic curves with respect to the Hellinger-Kantorovich distance. We will discuss this in Section 3.4 in detail.
3.2.2 The transport distance modulo reservoirs on the cone space


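Using the theory of optimal transport (cf. [AGS05, Vil09]) we can define the Kantorovich-Wasserstein distance associated with the cone distance $d_{\mathfrak{C}}$ on the set of all nonnegative and finite measures $\mathcal{M}_2(\mathfrak{C}_\Omega)$ as follows. If $\lambda_0(\mathfrak{C}_\Omega) \ne \lambda_1(\mathfrak{C}_\Omega)$, then we set $W_{\mathfrak{C}}(\lambda_0, \lambda_1) = \infty$ and otherwise we set

\[ W_{\mathfrak{C}}(\lambda_0, \lambda_1)^2 := \min\Big\{ \iint_{\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega} d_{\mathfrak{C}}(z_0, z_1)^2\, \mathrm{d}\gamma(z_0, z_1) \;\Big|\; \gamma \in \mathcal{M}(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega),\ \Pi^i_\#\gamma = \lambda_i \Big\}. \]

Moreover, the geodesic interpolation (which is not unique in general) can be described using an optimal transport plan $\gamma$ that is a minimizer in the definition of $W_{\mathfrak{C}}$. We denote the geodesic interpolator by

\[ Z_\gamma(s; \lambda_0, \lambda_1) := Z(s; \cdot, \cdot)_\#\gamma. \tag{3.15} \]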

For the proper handling of creation and annihilation of mass we introduce a modified distance. The modification occurs via a reservoir of mass in the vertex $\mathfrak{o}$ of the cone such that mass is generated from the reservoir and absorbed into it. The following result shows that if we assume that the reservoir is sufficiently big (and in our model for the Hellinger-Kantorovich operator it is in fact infinite), we never have any true transport over distances larger than $\pi/2$ (respectively, $\pi\sqrt{\alpha/\beta}$ in the scaled case), which is only half the critical distance of possible transport.


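Proposition 3.4 (Optimal transport in the presence of large reservoirs) We consider arbitrary measures $\lambda_0, \lambda_1 \in \mathcal{M}_2(\mathfrak{C}_\Omega)$ with equal masses $\lambda_0(\mathfrak{C}_\Omega) = \lambda_1(\mathfrak{C}_\Omega)$.

(a) The function $[0,\infty[\, \ni \kappa \mapsto w(\lambda_0, \lambda_1, \kappa) := W_{\mathfrak{C}}(\kappa\delta_{\mathfrak{o}} + \lambda_0, \kappa\delta_{\mathfrak{o}} + \lambda_1)$ is nonincreasing.

(b) Define the real numbers

\[ \theta_j := \lambda_j(\mathfrak{C}_\Omega \setminus \{\mathfrak{o}\}), \quad \rho_j := \lambda_j(\{\mathfrak{o}\}), \quad\text{and}\quad \kappa_* := \max\{0,\ \theta_1 - \rho_0,\ \theta_0 - \rho_1\}, \]

then for all $\kappa \ge \kappa_*$ we have $w(\lambda_0, \lambda_1, \kappa) = w(\lambda_0, \lambda_1, \kappa_*)$ with similar transport plans differing only by $(\kappa - \kappa_*)\delta_{(\mathfrak{o},\mathfrak{o})}$.

(c) For any optimal transport plan $\gamma$ connecting $\kappa_*\delta_{\mathfrak{o}} + \lambda_0$ and $\kappa_*\delta_{\mathfrak{o}} + \lambda_1$ we have $\gamma(\mathfrak{N}) = 0$, where

\[ \mathfrak{N} := \{\, (z_0, z_1) \in \mathfrak{C}_\Omega\times\mathfrak{C}_\Omega \mid r_0 r_1 > 0,\ |x_0 - x_1| > \pi/2 \,\}, \]

i.e., there is no transport in $\mathfrak{C}_\Omega$ over distances longer than $\pi/2$.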


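Proof: ad (a) Let $0 \le \kappa_1 < \kappa_2$ be given and let $\gamma_{\kappa_1}, \gamma_{\kappa_2} \in \mathcal{M}_2(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$ be optimal transport plans for the pairs $\kappa_1\delta_{\mathfrak{o}} + \lambda_i$ and $\kappa_2\delta_{\mathfrak{o}} + \lambda_i$, $i = 0, 1$, respectively. Since $\kappa_2 > \kappa_1$ we can define the transport plan $\tilde\gamma_{\kappa_2} = (\kappa_2 - \kappa_1)\delta_{(\mathfrak{o},\mathfrak{o})} + \gamma_{\kappa_1}$, which satisfies $\Pi^i_\#\tilde\gamma_{\kappa_2} = \kappa_2\delta_{\mathfrak{o}} + \lambda_i$. Thus, $\tilde\gamma_{\kappa_2}$ is an admissible plan for the minimization problem in the definition of $W_{\mathfrak{C}}$ and we obtain the estimate

\[
\begin{aligned}
w(\lambda_0, \lambda_1, \kappa_2)^2 = W_{\mathfrak{C}}(\kappa_2\delta_{\mathfrak{o}} + \lambda_0, \kappa_2\delta_{\mathfrak{o}} + \lambda_1)^2
&\le \iint_{\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega} d_{\mathfrak{C}}(z_0, z_1)^2\, \mathrm{d}\tilde\gamma_{\kappa_2}(z_0, z_1) \\
&= \iint_{\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega} d_{\mathfrak{C}}(z_0, z_1)^2\, \mathrm{d}\gamma_{\kappa_1}(z_0, z_1) = w(\lambda_0, \lambda_1, \kappa_1)^2.
\end{aligned}
\]

Hence, $\kappa \mapsto w(\lambda_0, \lambda_1, \kappa)$ is nonincreasing.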

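ad (b) Let $\gamma_\kappa$ and $\gamma_{\kappa_*}$ denote the optimal transport plans with respect to $\kappa$ and $\kappa_*$, respectively. Similar to (a) we define the measure $\tilde\gamma_{\kappa_*} = \gamma_\kappa - (\kappa - \kappa_*)\delta_{(\mathfrak{o},\mathfrak{o})}$. Obviously, $\tilde\gamma_{\kappa_*}$ satisfies $\Pi^i_\#\tilde\gamma_{\kappa_*} = \lambda_i + \kappa_*\delta_{\mathfrak{o}}$. It remains to show that $\tilde\gamma_{\kappa_*}$ is nonnegative and hence is an admissible transport plan. Indeed, due to the estimate $\gamma_\kappa(\{\mathfrak{o}\}\times(\mathfrak{C}_\Omega\setminus\{\mathfrak{o}\})) \le \theta_1$ and the definition of $\kappa_*$ we obtain

\[
\begin{aligned}
\gamma_\kappa(\{\mathfrak{o}\}\times\{\mathfrak{o}\}) &= \gamma_\kappa(\{\mathfrak{o}\}\times\mathfrak{C}_\Omega) - \gamma_\kappa(\{\mathfrak{o}\}\times(\mathfrak{C}_\Omega\setminus\{\mathfrak{o}\})) \\
&\ge \kappa + \rho_0 - \theta_1 \ge \kappa - \kappa_*.
\end{aligned}
\]

Hence, $\tilde\gamma_{\kappa_*} \ge 0$ and, arguing as for (a), we get $w(\lambda_0, \lambda_1, \kappa_*) \le w(\lambda_0, \lambda_1, \kappa)$. However, due to part (a) of the proposition even equality must hold.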

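ad (c) As before let $\gamma$ denote an optimal plan for lifts $\lambda_0$ and $\lambda_1$ of $\mu_0$ and $\mu_1$. Assume that $\gamma(\mathfrak{N}) > 0$ such that $d_{\mathfrak{C}}(z_0, z_1)^2 > r_0^2 + r_1^2$ for $\gamma$-a.a. $(z_0, z_1) \in \mathfrak{N}$ with $z_i = [x_i, r_i]$. We aim to construct a new transport plan $\tilde\gamma$ based on $\gamma$ giving a strictly lower cost and hence showing the non-optimality of $\gamma$.

To this end, we introduce the characteristic function $\chi$ of the subset $\mathfrak{N}^c := (\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega) \setminus \mathfrak{N}$. Moreover, we denote by $\tilde\lambda_i \in \mathcal{M}_2(\mathfrak{C}_\Omega)$ the marginals of $\gamma_\chi = \chi\gamma$, which are obviously absolutely continuous with respect to $\lambda_i$. We denote the densities with $\rho_i$ such that $\tilde\lambda_i = \rho_i\lambda_i$. In particular, for $\lambda_i$-a.e. $z \in \mathfrak{C}_\Omega$ we have that $0 \le \rho_i \le 1$.

We define the measure $\tilde\gamma$ via

\[ \tilde\gamma(\mathrm{d}z_0, \mathrm{d}z_1) = \gamma_\chi(\mathrm{d}z_0, \mathrm{d}z_1) + (1-\rho_0)\lambda_0(\mathrm{d}z_0)\,\delta_{\mathfrak{o}}(\mathrm{d}z_1) + \delta_{\mathfrak{o}}(\mathrm{d}z_0)\,(1-\rho_1)\lambda_1(\mathrm{d}z_1). \]

We easily check that the marginals of $\tilde\gamma$ are given by $\hat\lambda_i = \lambda_i + \kappa\delta_{\mathfrak{o}}$, $i = 0, 1$, where $\kappa \ge 0$ is given by $\kappa = (\gamma - \gamma_\chi)(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$. In particular, $\hat\lambda_i$ is an admissible lift for $\mu_i$.

It remains to show that $\tilde\gamma$ has a strictly lower cost than $\gamma$. We compute

\[
\begin{aligned}
\iint_{\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega} d_{\mathfrak{C}}(z_0,z_1)^2\, \mathrm{d}\tilde\gamma
&= \iint_{\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega} d_{\mathfrak{C}}(z_0,z_1)^2\, \mathrm{d}\gamma_\chi + \int_{\mathfrak{C}_\Omega} r_0^2 (1-\rho_0)\, \mathrm{d}\lambda_0 + \int_{\mathfrak{C}_\Omega} r_1^2 (1-\rho_1)\, \mathrm{d}\lambda_1 \\
&= \iint_{\mathfrak{N}^c} d_{\mathfrak{C}}(z_0,z_1)^2\, \mathrm{d}\gamma + \iint_{\mathfrak{N}} (r_0^2 + r_1^2)\, \mathrm{d}\gamma \\
&< \iint_{\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega} d_{\mathfrak{C}}(z_0,z_1)^2\, \mathrm{d}\gamma.
\end{aligned}
\]

Thus, $\gamma$ cannot be optimal and any optimal transport plan has to vanish on $\mathfrak{N}$. ■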

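Using the above proposition we may define a new distance $W_{\mathrm{rsv}}$ on $\mathcal{M}_2(\mathfrak{C}_\Omega)$ that assumes that the reservoir is always big enough. Indeed, we define

\[ W_{\mathrm{rsv}}(\lambda_0, \lambda_1) := \inf_{\kappa \ge 0} W_{\mathfrak{C}}(\lambda_0 + \kappa\delta_{\mathfrak{o}}, \lambda_1 + \kappa\delta_{\mathfrak{o}}) = W_{\mathfrak{C}}(\lambda_0 + \kappa_*\delta_{\mathfrak{o}}, \lambda_1 + \kappa_*\delta_{\mathfrak{o}}), \]

where $\kappa_*$ is given as in Proposition 3.4(b).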


3.3 The Hellinger-Kantorovich distance

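We can now easily define a distance for measures on $\Omega$ by lifting measures $\mu_j \in \mathcal{M}(\Omega)$ to measures in $\mathcal{M}_2(\mathfrak{C}_\Omega)$ and projecting back measures from $\mathcal{M}_2(\mathfrak{C}_\Omega)$ into $\mathcal{M}(\Omega)$. We define the projection $\mathfrak{P} : \mathcal{M}_2(\mathfrak{C}_\Omega) \to \mathcal{M}(\Omega)$ via

\[ \int_\Omega \phi(x)\, \mathrm{d}(\mathfrak{P}\lambda) = \int_{\mathfrak{C}_\Omega} r^2 \phi(x)\, \mathrm{d}\lambda([x, r]) \quad\text{for all } \phi \in C^0(\Omega). \]

In the last formula we use that for $r > 0$ the equivalence class $[x, r]$ uniquely determines $x$ and $r$ and that the prefactor $r^2$ makes the function $\Phi : [x, r] \mapsto r^2\phi(x)$ continuous if we set $\Phi(\mathfrak{o}) = 0$. In the case $\mu = \mathfrak{P}\lambda$ we call $\lambda$ a lift of $\mu$.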

The first and most intuitive result on the distance $D_{1,4}$ induced by the Onsager operator $\mathbb{K}_{1,4}$ in (3.1) is the following formula, which we formulate as a definition first and then show that it equals the distance $D_{1,4}$.

Definition 3.5 The Hellinger-Kantorovich distance on $\mathcal{M}(\Omega)$ is defined as

\[ \mathsf{HK}(\mu_0, \mu_1) = \min\big\{\, W_{\mathfrak{C}}(\lambda_0, \lambda_1) \;\big|\; \mathfrak{P}\lambda_0 = \mu_0,\ \mathfrak{P}\lambda_1 = \mu_1 \,\big\}. \tag{3.16} \]

Before proving the identity $\mathsf{HK} = \mathsf{HK}_{1,4} = D_{1,4}$ in Section 4 we collect some properties of $\mathsf{HK}$. First, we emphasize that the projection $\mathfrak{P}$ does not see the reservoirs at $\mathfrak{o}$, hence the above formula already includes arbitrary reservoirs according to Proposition 3.4.


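Next, let us remark that $\mathsf{HK}$ satisfies an important scaling invariance: Let $\vartheta : \mathfrak{C}_\Omega\times\mathfrak{C}_\Omega \to\, ]0,\infty[$ be a Borel map and define the dilation function $h_\vartheta : \mathfrak{C}_\Omega\times\mathfrak{C}_\Omega \to \mathfrak{C}_\Omega\times\mathfrak{C}_\Omega$ via

\[ h_\vartheta(z_0, z_1) = \Big( \Big[x_0, \frac{r_0}{\vartheta(z_0,z_1)}\Big], \Big[x_1, \frac{r_1}{\vartheta(z_0,z_1)}\Big] \Big) \quad\text{for } z_i = [x_i, r_i]. \tag{3.17} \]

Then, given any transport plan $\gamma \in \mathcal{M}_2(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$ we define the dilated plan $\gamma_\vartheta = (h_\vartheta)_\#(\vartheta^2\gamma)$ in $\mathcal{M}(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$. Letting $\lambda_i$ and $\lambda_i^\vartheta$ denote the marginals of $\gamma$ and $\gamma_\vartheta$ we have that

\[ \iint d_{\mathfrak{C}}(z_0, z_1)^2\, \mathrm{d}\gamma = \iint d_{\mathfrak{C}}(z_0, z_1)^2\, \mathrm{d}\gamma_\vartheta \quad\text{and}\quad \mathfrak{P}\lambda_i = \mathfrak{P}\lambda_i^\vartheta. \tag{3.18} \]

In particular, we can always assume that the transport plans $\gamma$ and the lifts $\lambda_i$ are probability measures, e.g. by setting $\vartheta = (\gamma(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega))^{-1/2}$.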
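The invariance (3.18) is easy to verify for purely atomic plans. In the sketch below a plan is a list of atoms `(weight, (x0, r0), (x1, r1))` over a one-dimensional base; this data layout, the helper names, and the assumption $\cos_\pi(L) = \cos(\min\{L,\pi\})$ are illustrative choices and not part of the paper.

```python
import math


def cone_dist_sq(z0, z1):
    """Squared cone distance for points z = (x, r) over a 1-D base."""
    (x0, r0), (x1, r1) = z0, z1
    c = math.cos(min(abs(x1 - x0), math.pi))
    return r0 * r0 + r1 * r1 - 2 * r0 * r1 * c


def dilate(plan, theta):
    """Dilation as in (3.17): radii are divided by theta, weights multiplied by theta^2."""
    out = []
    for w, (x0, r0), (x1, r1) in plan:
        t = theta((x0, r0), (x1, r1))
        out.append((w * t * t, (x0, r0 / t), (x1, r1 / t)))
    return out


def cost(plan):
    """Transport cost: sum of weight * squared cone distance over all atoms."""
    return sum(w * cone_dist_sq(z0, z1) for w, z0, z1 in plan)


def projected_marginal(plan, i):
    """Projection of the i-th marginal: mass weight * r^2 at the base point x."""
    mu = {}
    for atom in plan:
        w, (x, r) = atom[0], atom[1 + i]
        mu[x] = mu.get(x, 0.0) + w * r * r
    return mu


if __name__ == "__main__":
    plan = [(0.5, (0.0, 1.0), (1.0, 2.0)), (2.0, (0.0, 3.0), (0.4, 0.5))]
    dilated = dilate(plan, lambda z0, z1: 1.0 + z0[1] + z1[1])
    print(cost(plan), cost(dilated))
    print(projected_marginal(plan, 0), projected_marginal(dilated, 0))
```

Choosing the constant dilation `theta = total_mass ** -0.5` normalizes the plan to a probability measure, as remarked above.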

The main result of this section is the following structural theorem. For a full proof we refer to [LMS15], where a more general case is considered. In particular, there $\Omega$ is replaced by general complete geodesic spaces. However, because of the strong relevance of part (v) for the subsequent applications, we present a full proof of the identity $\mathsf{HK} = D_{1,4}$ in Section 4.


Theorem 3.6 (Properties of $\mathsf{HK}$) The distance $\mathsf{HK} : \mathcal{M}(\Omega) \times \mathcal{M}(\Omega) \to [0,\infty[$ has the following properties:

i) For each pair $\mu_0, \mu_1$ there exists an optimal pair $\lambda_0, \lambda_1$ of lifts;
ii) For all measures $\mu_0, \mu_1$ the upper bound $\mathsf{HK}(\mu_0, \mu_1)^2 \le \mu_0(\Omega) + \mu_1(\Omega)$ is satisfied;
iii) $(\mathcal{M}(\Omega), \mathsf{HK})$ is a complete and separable metric space;
iv) The topology induced by $\mathsf{HK}$ coincides with the weak topology on $\mathcal{M}(\Omega)$;
v) The distance $\mathsf{HK}$ is induced by the Onsager operator $\mathbb{K}_{1,4}$.
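3.3.1 Consistency of above formulas with distance of Dirac masses


We come back to the example of the optimal transport and absorption/desorption of two point masses in Subsection 3.1 and discuss the consistency of the above formulas. Let $\mu_0 = a_0\delta_{x_0}$ and $\mu_1 = a_1\delta_{x_1}$ denote two Dirac masses such that $a_0, a_1 > 0$. We consider lifts $\lambda_0, \lambda_1 \in \mathcal{M}_2(\mathfrak{C}_\Omega)$ of the particular form

\[ \lambda_0 = \Big(\kappa + \frac{a_1}{r_1^2}\Big)\delta_{\mathfrak{o}} + \frac{a_0}{r_0^2}\,\delta_{[x_0, r_0]} \quad\text{and}\quad \lambda_1 = \Big(\kappa + \frac{a_0}{r_0^2}\Big)\delta_{\mathfrak{o}} + \frac{a_1}{r_1^2}\,\delta_{[x_1, r_1]}, \]

where $\kappa \ge 0$ and $r_i > 0$ are arbitrary but fixed constants. In particular, we have equal mass $\lambda_0(\mathfrak{C}_\Omega) = \lambda_1(\mathfrak{C}_\Omega)$ and $\mathfrak{P}\lambda_i = \mu_i$, i.e., $\lambda_i$ is indeed a lift for $\mu_i$.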

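The possible transport plans $\gamma \in \mathcal{M}_2(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$ are uniquely characterized by the value $g := \gamma(\{([x_0, r_0], [x_1, r_1])\}) \in [0, \min\{a_0/r_0^2, a_1/r_1^2\}]$, where the interval boundaries correspond to complete absorption/desorption and complete transport.

Denoting $z_i = [x_i, r_i]$ we find

\[
\begin{aligned}
\iint d_{\mathfrak{C}}(z_0, z_1)^2\, \mathrm{d}\gamma(z_0, z_1)
&= \frac{a_0}{r_0^2}\, d_{\mathfrak{C}}(z_0, \mathfrak{o})^2 + \frac{a_1}{r_1^2}\, d_{\mathfrak{C}}(z_1, \mathfrak{o})^2
 + g\,\big[ d_{\mathfrak{C}}(z_0, z_1)^2 - d_{\mathfrak{C}}(z_0, \mathfrak{o})^2 - d_{\mathfrak{C}}(z_1, \mathfrak{o})^2 \big] \\
&= a_0 + a_1 - 2 g\, r_0 r_1 \cos_\pi(|x_0 - x_1|).
\end{aligned}
\]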


To get the optimal cost we have to minimize with respect to $g \in [0, \min\{a_0/r_0^2, a_1/r_1^2\}]$: For $L := |x_0 - x_1| > \pi/2$ the optimal value is $g = 0$, which corresponds to the pure Hellinger reaction case. For $L = \pi/2$ any $g$ is possible giving a convex set of optimal plans. In fact, it will be shown in Section 5 that any pair of lifts is optimal in this case. Hence, the case $L \ge \pi/2$ yields $W_{\mathfrak{C}}(\lambda_0, \lambda_1) = \sqrt{a_0 + a_1} = \mathsf{HK}(a_0\delta_{x_0}, a_1\delta_{x_1})$ as desired. For $L < \pi/2$ we have to choose $g = \min\{a_0/r_0^2, a_1/r_1^2\}$, i.e., the maximal value. With this we obtain

\[ W_{\mathfrak{C}}(\lambda_0, \lambda_1)^2 = a_0 + a_1 - 2 g_*(r_0, r_1)\cos(L), \]

where $g_*(r_0, r_1) := \min\{a_0 r_1/r_0,\ a_1 r_0/r_1\}$. In particular, different lifts $\lambda_i = \lambda_i(r_0, r_1)$ give different costs. However, an easy calculation shows that for $r_1/r_0 = \sqrt{a_1/a_0}$ an optimal value is achieved, such that $W_{\mathfrak{C}}(\lambda_0, \lambda_1)^2 = a_0 + a_1 - 2\sqrt{a_0 a_1}\cos(L) = \mathsf{HK}(a_0\delta_{x_0}, a_1\delta_{x_1})^2$.
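The dependence of the lifted cost on the radii $(r_0, r_1)$ can be explored with a few lines of code. The sketch below evaluates the cost $a_0 + a_1 - 2 g r_0 r_1\cos_\pi(L)$ at the better endpoint of the admissible interval for $g$ and compares it with the closed form for two Dirac masses; the helper names are hypothetical, and $\cos_\pi(L) = \cos(\min\{L,\pi\})$ is assumed.

```python
import math


def lifted_cost_sq(a0, a1, r0, r1, L):
    """Squared transport cost between the two-atom lifts with radii r0, r1."""
    c = math.cos(min(L, math.pi))
    g_max = min(a0 / r0 ** 2, a1 / r1 ** 2)
    # The cost is affine in g, so the optimum sits at an endpoint of [0, g_max].
    g = g_max if c > 0 else 0.0
    return a0 + a1 - 2 * g * r0 * r1 * c


def hk_sq_dirac(a0, a1, L):
    """Closed form a0 + a1 - 2 sqrt(a0 a1) cos(min(L, pi/2)) for alpha = 1, beta = 4."""
    return a0 + a1 - 2 * math.sqrt(a0 * a1) * math.cos(min(L, math.pi / 2))


if __name__ == "__main__":
    a0, a1, L = 2.0, 3.0, 0.8
    print(lifted_cost_sq(a0, a1, 1.0, math.sqrt(a1 / a0), L), hk_sq_dirac(a0, a1, L))
    print(lifted_cost_sq(a0, a1, 1.0, 1.0, L))  # a non-optimal choice of radii
```

Only the ratio $r_1/r_0$ matters here, which reflects the scaling invariance (3.18).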

For calculating the distance $\mathsf{HK}$ the particular choice of an optimal lift is not important, but we will see in Section 5.2 that in the case $L = \pi/2$ different lifts may give rise to different geodesic curves. Hence, we highlight here that even in the trivial case $L < \pi/2$ there are many optimal lifts. E.g. for $a_1 = a_0 > 0$ any $\nu \in \mathcal{M}_2([0,\infty[)$ with $\int r^2\,\mathrm{d}\nu = a_0$ defines optimal lifts $\lambda_j = \delta_{x_j} \otimes \nu$.


3.3.2 Logarithmic-entropy transport functional 

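In this subsection we give the formula for the distance via a minimization problem and discuss a few of its properties, in particular its consistency with the distance of Dirac masses. We do this for the case of general positive $\alpha$ and $\beta$.

Using the Boltzmann function $F_{\mathrm{B}}(\rho) = \rho\log\rho - \rho + 1 \ge 0$ with $F_{\mathrm{B}}'(\rho) = \log\rho$ and $F_{\mathrm{B}}(0) = 1$ we define the Hellinger-Kantorovich functional for any $\mu_0, \mu_1 \in \mathcal{M}(\Omega)$ as follows. For $\eta \in \mathcal{M}(\Omega\times\Omega)$ we define the marginals $\eta_j = \Pi^j_\#\eta$ and assume $\eta_0 \ll \mu_0$ and $\eta_1 \ll \mu_1$ and define the Hellinger-Kantorovich entropy-transport functional via

\[ \mathsf{ET}_{\alpha,\beta}(\eta; \mu_0, \mu_1) := \frac{4}{\beta}\sum_{j=0}^{1} \int_\Omega F_{\mathrm{B}}\Big(\frac{\mathrm{d}\eta_j}{\mathrm{d}\mu_j}\Big)\, \mathrm{d}\mu_j + \iint_{\Omega\times\Omega} c_{\alpha,\beta}(|x_0 - x_1|)\, \mathrm{d}\eta, \]

where the cost function is given by

\[
c_{\alpha,\beta}(L) :=
\begin{cases}
-\dfrac{8}{\beta}\log\Big(\cos\big(\sqrt{\beta/(4\alpha)}\,L\big)\Big) & \text{for } L < \pi\sqrt{\alpha/\beta}, \\
\infty & \text{for } L \ge \pi\sqrt{\alpha/\beta}.
\end{cases}
\]

We see that $\mathsf{ET}_{\alpha,\beta}(\,\cdot\,; \mu_0, \mu_1)$ is a convex functional, thus it is easy to find minimizers. The following characterization is proved in full detail in [LMS15]. Here we will only motivate the construction by giving some examples.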


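Theorem 3.7 (Characterization of $\mathsf{HK}_{\alpha,\beta}$ via minimization) For $\alpha, \beta > 0$ the distance induced by the Onsager operator $\mathbb{K}_{\alpha,\beta}$ is given as follows:

\[ \mathsf{HK}_{\alpha,\beta}(\mu_0, \mu_1)^2 = \mathsf{ET}_{\alpha,\beta}(\mu_0, \mu_1) := \inf\big\{\, \mathsf{ET}_{\alpha,\beta}(\eta; \mu_0, \mu_1) \;\big|\; \eta \in \mathcal{M}(\Omega\times\Omega),\ \eta_j \ll \mu_j \,\big\}. \]

For every pair $(\mu_0, \mu_1)$ at least one minimizer $\eta$ exists, which we call a calibration measure for this pair.

Moreover, an optimal calibration measure $\eta$ satisfies for $\rho_j := \mathrm{d}\eta_j/\mathrm{d}\mu_j$ the following optimality conditions:

\[
\begin{aligned}
&|x_0 - x_1| < \pi\sqrt{\alpha/\beta} && \text{for } \eta\text{-a.e. } (x_0, x_1) \in \Omega\times\Omega, \\
&\rho_0(x_0)\rho_1(x_1) \le \cos\big(\sqrt{\beta/(4\alpha)}\,|x_0 - x_1|\big)^2 && \text{for } \mu_0\text{-a.e. } x_0 \in \Omega \text{ and } \mu_1\text{-a.e. } x_1 \in \Omega, \\
&\rho_0(x_0)\rho_1(x_1) = \cos\big(\sqrt{\beta/(4\alpha)}\,|x_0 - x_1|\big)^2 > 0 && \text{for } \eta\text{-a.e. } (x_0, x_1) \in \Omega\times\Omega.
\end{aligned}
\]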

In the framework of this paper, the relevance of this new characterization of $D_{1,4} = \mathsf{HK}$ is that the minimization of $\mathsf{ET}_{1,4}$ is much simpler than the characterization of $\mathsf{HK}$ in terms of lifts to the cone space. Finding the optimal lifts and then calculating the optimal transport on the cone space is certainly more involved. In [LMS15] it is shown that $\mathsf{ET}_{1,4}$ has a much stronger intrinsic value, and it proves an essential tool for establishing the results in Theorem 3.6.


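Example 3.8 (Mass splitting, part 2) We return to Example 3.3 where we calculated the distance between

\[ \mu_0 = a_0\delta_{x_0} \quad\text{and}\quad \mu_1 = a_1\delta_{x_0} + b_1\delta_{x_1} \]

with $L = |x_0 - x_1| < \pi/2$ and $\alpha = 1$, $\beta = 4$. We show that the formulation in Theorem 3.7 indeed gives the same cost. Since the marginals $\eta_j$ have to have a density with respect to $\mu_j$ and since $\eta_0$ and $\eta_1$ must have equal mass, we consider

\[ \eta_0 = \theta_0\delta_{x_0} \quad\text{and}\quad \eta_1 = (\theta_0 - \theta_1)\delta_{x_0} + \theta_1\delta_{x_1} \]

with $\theta_0 \ge \theta_1 \ge 0$. Using the formula in Theorem 3.7 yields for $\mathsf{ET} = \mathsf{ET}_{1,4}$ and $c = c_{1,4}$

\[ \mathsf{ET}(\mu_0, \mu_1) = \inf\Big\{\, F_{\mathrm{B}}\big(\tfrac{\theta_0}{a_0}\big) a_0 + F_{\mathrm{B}}\big(\tfrac{\theta_0 - \theta_1}{a_1}\big) a_1 + F_{\mathrm{B}}\big(\tfrac{\theta_1}{b_1}\big) b_1 + \theta_1 c(L) \;\Big|\; \theta_0 \ge \theta_1 \ge 0 \,\Big\}. \]

This infimum can be evaluated explicitly and we obtain

\[ \mathsf{ET}(\mu_0, \mu_1) = \mathsf{HK}(\mu_0, \mu_1)^2 = a_0 + a_1 + b_1 - 2\sqrt{a_0\big(a_1 + b_1\cos(|x_0 - x_1|)^2\big)}, \]

which is the same as in Example 3.3.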
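The explicit evaluation of this infimum can be reproduced numerically. The sketch below assumes the cost $c(L) = -2\log\cos(L)$ for $\alpha = 1$, $\beta = 4$ and uses the critical point obtained by setting both partial derivatives of the objective to zero; this critical point is my own derivation for illustration and is not quoted from the paper.

```python
import math


def F_B(p):
    """Boltzmann function p log p - p + 1, extended continuously by F_B(0) = 1."""
    return 1.0 if p == 0.0 else p * math.log(p) - p + 1.0


def entropy_transport(theta0, theta1, a0, a1, b1, L):
    """Objective of Example 3.8 with c(L) = -2 log cos(L), valid for L < pi/2."""
    c = -2.0 * math.log(math.cos(L))
    return (F_B(theta0 / a0) * a0 + F_B((theta0 - theta1) / a1) * a1
            + F_B(theta1 / b1) * b1 + theta1 * c)


def stationary_point(a0, a1, b1, L):
    """Critical point (theta0, theta1) of the convex objective (illustrative derivation)."""
    q = b1 * math.cos(L) ** 2
    theta0 = math.sqrt(a0 * (a1 + q))
    theta1 = a0 * q / theta0
    return theta0, theta1


if __name__ == "__main__":
    a0, a1, b1, L = 2.0, 1.0, 3.0, 0.7
    t0, t1 = stationary_point(a0, a1, b1, L)
    closed = a0 + a1 + b1 - 2 * math.sqrt(a0 * (a1 + b1 * math.cos(L) ** 2))
    print(entropy_transport(t0, t1, a0, a1, b1, L), closed)
```

Since the objective is convex, the critical point is a global minimizer, so the two printed values should coincide.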



3.3.3 Reduction to special lifts 

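The characterization of the Hellinger-Kantorovich distance in terms of the logarithmic-entropy transport functional gives rise to another helpful property: To calculate $\mathsf{HK}(\mu_0, \mu_1)$ it is sufficient to consider lifts $\lambda_j$ of a special form only. Indeed, assume that $\eta \in \mathcal{M}(\Omega\times\Omega)$ is a minimizer of $\mathsf{ET}_{1,4}$ for given $\mu_0$ and $\mu_1$ and consider for $\eta_i = \Pi^i_\#\eta$ the Lebesgue decomposition $\mu_i = \sigma_i\eta_i + \mu_i^\perp$. Then, the transport plan $\gamma_\eta \in \mathcal{M}(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$ defined by

\[
\begin{aligned}
\gamma_\eta(\mathrm{d}z_0, \mathrm{d}z_1) ={}& \delta_{\sqrt{\sigma_0(x_0)}}(\mathrm{d}r_0)\,\delta_{\sqrt{\sigma_1(x_1)}}(\mathrm{d}r_1)\,\eta(\mathrm{d}x_0, \mathrm{d}x_1) \\
&+ \delta_1(\mathrm{d}r_0)\,\mu_0^\perp(\mathrm{d}x_0)\,\delta_{\mathfrak{o}}(\mathrm{d}z_1) + \delta_{\mathfrak{o}}(\mathrm{d}z_0)\,\delta_1(\mathrm{d}r_1)\,\mu_1^\perp(\mathrm{d}x_1)
\end{aligned}
\]

and the associated lifts $\lambda_i = \Pi^i_\#\gamma_\eta$ are optimal in Definition 3.5 for $\mathsf{HK}$, see [LMS15, Thm. 7.21] for the proof. In particular, we can restrict the analysis to lifts of $\mu_j$ characterized by a single positive function $\mathfrak{r} > 0$ on $\Omega$, namely

\[ \mathfrak{L}(\mu, \mathfrak{r}, \kappa) = \kappa\delta_{\mathfrak{o}} + \frac{1}{\mathfrak{r}(x)^2}\,\delta_{\mathfrak{r}(x)}(\mathrm{d}r)\,\mu(\mathrm{d}x), \quad\text{such that} \]

\[ \int_{\mathfrak{C}_\Omega} \Phi(z)\, \mathrm{d}\mathfrak{L}(\mu, \mathfrak{r}, \kappa) = \kappa\,\Phi(\mathfrak{o}) + \int_\Omega \frac{\Phi([x, \mathfrak{r}(x)])}{\mathfrak{r}(x)^2}\, \mathrm{d}\mu \quad\text{for all } \Phi \in C^0(\mathfrak{C}_\Omega). \]

We collect this observation in the following result.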


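Proposition 3.9 ($\mathsf{HK}$ via special lifts) We have the equivalent characterization

\[ \mathsf{HK}(\mu_0, \mu_1) = \min\big\{\, W_{\mathfrak{C}}\big(\mathfrak{L}(\mu_0, \mathfrak{r}_0, \kappa_0), \mathfrak{L}(\mu_1, \mathfrak{r}_1, \kappa_1)\big) \;\big|\; \mathfrak{r}_j > 0,\ \kappa_j \ge 0 \,\big\}. \tag{3.19} \]

Moreover, it is sufficient to consider transport plans $\gamma \in \mathcal{M}_2(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$ of the form

\[
\begin{aligned}
\gamma ={}& \delta_{\mathfrak{r}_0(x_0)}(\mathrm{d}r_0)\,\eta_0(\mathrm{d}x_0)\,\delta_{\mathfrak{o}}(\mathrm{d}z_1) + \delta_{\mathfrak{o}}(\mathrm{d}z_0)\,\delta_{\mathfrak{r}_1(x_1)}(\mathrm{d}r_1)\,\eta_1(\mathrm{d}x_1) \\
&+ \delta_{\mathfrak{r}_0(x_0)}(\mathrm{d}r_0)\,\delta_{\mathfrak{r}_1(x_1)}(\mathrm{d}r_1)\,\eta(\mathrm{d}x_0, \mathrm{d}x_1)
\end{aligned}
\]

for positive functions $\mathfrak{r}_j : \Omega \to\, ]0,\infty[$ and measures $\eta_i \in \mathcal{M}(\Omega)$ and $\eta \in \mathcal{M}(\Omega\times\Omega)$.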

Using the dehnition of in terms of and the form of the lifts, the functional in 


(3.19) can be written as 


^(hUo,ffi;ho,hi) :=/io(^)+hi(^) - / 2fo(a;o)ri(xi)cos^/2 |a;o-a;i|d?7(a;o,a;i), 

J flxQ 


and the following characterization of hK follows: 


hK(/io,/ii) = min ^( 7 , fo, fi;/ xq,/ ii) p e M{QxQ),W^r] = rffij + fi 


(3.20) 


We emphasize that not all optimal transport plans are of the form depicted in Proposition 3.9. In particular, using again the example of two mass points we show in Section 5 that in the case of the critical distance $|x_0 - x_1| = \pi/2$ lifts are quite arbitrary.
3.3.4 Recovering the Hellinger and Wasserstein-Kantorovich distances


The log-entropy formulation of the Hellinger-Kantorovich distance is well suited to pass to the limits $\alpha \to 0$ or $\beta \to 0$.

Since apart from the prefactor $1/\beta$ the functional $\mathsf{ET}_{\alpha,\beta}$ only depends on $\beta/\alpha$, we can set $\alpha = 1$ and consider the case $\beta \to 0$. For the cost functional we obtain the expansion

\[ c_{1,\beta}(x_0, x_1) = |x_1 - x_0|^2 + O(\beta) \]

uniformly on $\Omega\times\Omega$, which is compact. Hence the linear transport functional converges to the Kantorovich functional for the usual Euclidean cost function. Simultaneously the entropic terms blow up, which means that in the limit $\beta = 0$, we obtain the condition $\eta_j = \mu_j$. Thus, we expect to obtain the Wasserstein distance in the limit, i.e. $\mathsf{HK}_{1,0}(\mu_0, \mu_1) = W(\mu_0, \mu_1)$.

Keeping $\beta = 4$ fixed and considering $\alpha \to 0$ we obtain

\[
c_{\alpha,4}(x_0, x_1) \to
\begin{cases}
0 & \text{for } x_0 = x_1, \\
\infty & \text{for } x_0 \ne x_1.
\end{cases}
\]

Thus, optimal calibration measures for $\alpha = 0$ will have support on the diagonal $\{ (x, x) \in \Omega\times\Omega \mid x \in \Omega \}$, such that the transport cost equals 0 and that $\nu := \eta_0 = \eta_1$. Minimizing the sum of the two entropic terms with respect to $\nu$ we obtain the unique solution $\nu$ from the optimality condition $\frac{\mathrm{d}\nu}{\mathrm{d}\mu_0}\,\frac{\mathrm{d}\nu}{\mathrm{d}\mu_1} = 1$ and we find $\mathsf{HK}_{0,4}(\mu_0, \mu_1) = D_{\mathrm{Hell}}(\mu_0, \mu_1) = \|\sqrt{\mu_0} - \sqrt{\mu_1}\|_{L^2}$.
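For measures with finitely many atoms on a common support, the Hellinger limit can be checked atom by atom. The sketch below minimizes nothing numerically; it simply evaluates the two entropic terms at the candidate $\nu = \sqrt{\mu_0\mu_1}$ suggested by the optimality condition and compares the result with the squared Hellinger distance. The discrete setting and the function names are illustrative assumptions.

```python
import math


def boltzmann(p):
    """Boltzmann function p log p - p + 1 for p > 0."""
    return p * math.log(p) - p + 1.0


def entropic_sum(n, f0, f1):
    """F_B(n/f0) f0 + F_B(n/f1) f1 for one atom with masses f0, f1 > 0 and n > 0."""
    return boltzmann(n / f0) * f0 + boltzmann(n / f1) * f1


def hellinger_sq_discrete(mu0, mu1):
    """Sum of (sqrt(p) - sqrt(q))^2 over atoms of two measures on the same finite set."""
    return sum((math.sqrt(p) - math.sqrt(q)) ** 2 for p, q in zip(mu0, mu1))


if __name__ == "__main__":
    mu0, mu1 = [2.0, 1.0], [0.5, 4.0]
    total = sum(entropic_sum(math.sqrt(p * q), p, q) for p, q in zip(mu0, mu1))
    print(total, hellinger_sq_discrete(mu0, mu1))
```

Perturbing the candidate $\nu$ at any atom increases the entropic sum, in line with the uniqueness of the minimizer.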


3.4 Geodesic curves induced by optimal transport plans 


Let $\mu_0, \mu_1 \in \mathcal{M}(\Omega)$ be two given measures. The geodesic curves with respect to the Hellinger-Kantorovich distance $\mathsf{HK}$ are induced by the geodesic curves in the underlying cone space.

More precisely, the construction of the geodesic curve $s \mapsto \mu(s)$ is based on the geodesic interpolator $Z$ defined in (3.13): Let $\lambda_0 \in \mathcal{M}_2(\mathfrak{C}_\Omega)$ and $\lambda_1 \in \mathcal{M}_2(\mathfrak{C}_\Omega)$ be optimal lifts for $\mu_0$ and $\mu_1$, respectively, and let $\gamma \in \mathcal{M}_2(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$ be the associated optimal transport plan. Then, a geodesic curve $\mu(s) = \mathfrak{G}(s; \mu_0, \mu_1)$ is obtained via the projection of the geodesic curve for $\lambda_0$ and $\lambda_1$ in $\mathcal{M}_2(\mathfrak{C}_\Omega)$ via

\[ \mu(s) = \mathfrak{G}(s; \mu_0, \mu_1) := \mathfrak{P}\lambda(s) \quad\text{with}\quad \lambda(s) = Z(s; \cdot, \cdot)_\#\gamma. \tag{3.21} \]

Note that since the optimal transport plan $\gamma$ is not necessarily unique, the geodesics in $\mathcal{M}(\Omega)$ are also not necessarily unique:

Example 3.10 (i) On $\Omega = \,]-2,2[^2$ we consider the measure $\mu_0 = \delta_{(-1,0)} + \delta_{(1,0)}$, and $\mu_1$ is the line measure concentrated in $\{0\} \times\, ]-1,1[$. Due to the high symmetry of the problem, it is easy to see that there are infinitely many optimal transport plans, which give rise to different geodesic curves.


(ii) Consider the case of two mass points $\mu_j = a_j\delta_{x_j}$ with $|x_0 - x_1| = \pi/2$. It is easy to see that in this case $\mu(s) = a(s)\delta_{x(s)}$ with $x(s) = (1-\rho(s))x_0 + \rho(s)x_1$ and $a(s)$ and $\rho(s)$ as in Theorem 3.1 and $\tilde\mu(s) = (1-s)^2 a_0\delta_{x_0} + s^2 a_1\delta_{x_1}$ are both geodesic curves.


However, the situation is even more complicated since even along a geodesic curve... 
Figure 4: Cone geodesic (dotted) for $z_0 = [x_0, \sqrt{a_0}]$ and $z_1 = [x_1, \sqrt{a_1}]$ compared to the Hellinger-Kantorovich geodesic (solid) for $\mu_0 = a_0\delta_{x_0}$ and $\mu_1 = a_1\delta_{x_1}$ in the case $|x_0 - x_1| > \pi/2$. The Hellinger-Kantorovich geodesic consists of two parts: one part is going to the reservoir (absorption), while the other one is simultaneously coming from the reservoir (generation).


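Theorem 3.11 The curve $s \mapsto \mu(s)$ defined in (3.21) is a constant-speed geodesic with respect to the Hellinger-Kantorovich distance $\mathsf{HK}$, i.e.,

\[ \mathsf{HK}(\mu(s), \mu(t)) = |t - s|\, \mathsf{HK}(\mu_0, \mu_1) \quad\text{for all } 0 \le s < t \le 1. \]

Proof: Fix $0 \le s < t \le 1$ and let $\gamma \in \mathcal{M}_2(\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega)$ denote the optimal transport plan. We define the map $\Pi_{st} : \mathfrak{C}_\Omega\times\mathfrak{C}_\Omega \to \mathfrak{C}_\Omega\times\mathfrak{C}_\Omega$ via $\Pi_{st}(z_0, z_1) = (Z(s; z_0, z_1), Z(t; z_0, z_1))$ and introduce the transport plan $\gamma_{st} = (\Pi_{st})_\#\gamma$ whose marginals are given by $\lambda(s)$ and $\lambda(t)$, respectively. In particular, we have the upper estimate

\[ \mathsf{HK}(\mu(s), \mu(t)) \le W_{\mathfrak{C}}(\lambda(s), \lambda(t)) \le \Big( \iint_{\mathfrak{C}_\Omega\times\mathfrak{C}_\Omega} d_{\mathfrak{C}}(z_0, z_1)^2\, \mathrm{d}\gamma_{st} \Big)^{1/2}. \]

However, using the definition of $\gamma_{st}$ and that $Z$ is the geodesic interpolator in $\mathfrak{C}_\Omega$ we obtain

\[ \mathsf{HK}(\mu(s), \mu(t)) \le |s - t|\, \mathsf{HK}(\mu_0, \mu_1) \quad\text{for all } 0 \le s < t \le 1. \tag{3.22} \]

To see that actually equality holds we use the triangle inequality and (3.22) to find

\[
\begin{aligned}
\mathsf{HK}(\mu_0, \mu_1) &\le \mathsf{HK}(\mu_0, \mu(s)) + \mathsf{HK}(\mu(s), \mu(t)) + \mathsf{HK}(\mu(t), \mu_1) \\
&\le \big(s + (t-s) + (1-t)\big)\, \mathsf{HK}(\mu_0, \mu_1) = \mathsf{HK}(\mu_0, \mu_1).
\end{aligned}
\]

Thus, all inequalities are equalities, which proves the theorem. ■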


4 Equivalence to the dynamical formulation 


In this section we provide the proof of part (v) of Theorem 3.6 and show the equivalence of the two definitions of the Hellinger-Kantorovich distance, namely the formulation via lifts and optimal transport on the cone space and the dynamical formulation given by the Onsager operator $\mathbb{K} := \mathbb{K}_{1,4}$, i.e. $D_{1,4} = \mathsf{HK}$ with $D_{1,4}$ from (3.1) and $\mathsf{HK}$ from (3.16).

The proof is based on the characterization of absolutely continuous curves and their metric derivative with respect to $\mathsf{HK}$. In particular, we show in Theorem 4.5 that each absolutely continuous curve whose metric derivative is square integrable satisfies the modified continuity equation in the definition of $D_{1,4}$ in the distributional sense for a suitable vector and scalar field $\Xi$ and $\xi$. Moreover, the $L^2(\mathrm{d}\mu)$-norms of $\Xi$ and $\xi$ provide a lower bound for the metric derivative.

Vice versa we prove in Theorem 4.6 that a continuous solution $t \mapsto \mu(t)$ of the modified continuity equation for given vector and scalar fields $\Xi$ and $\xi$ is absolutely continuous with respect to $\mathsf{HK}$ and the $L^2(\mathrm{d}\mu)$-norms give an upper estimate for the metric derivative.

Finally, Theorem 3.6(v) is proven at the end of this section.

We recall that a curve $[0,1] \ni t \mapsto u(t)$ in a metric space $(V, D)$ is called absolutely continuous if there exists a function $m \in L^1(0,1)$ such that

\[ D(u(s), u(t)) \le \int_s^t m(r)\, \mathrm{d}r \quad\text{for all } 0 \le s < t \le 1. \tag{4.1} \]

We write $u \in \mathrm{AC}^p(0,1; (V, D))$ if $m \in L^p(0,1)$ for $p \in [1,\infty]$. Moreover, among all possible choices for $m$ there exists a minimal one, which is given by the metric derivative, see e.g. [AGS05, Sect. 1.1],

\[ |\dot u|_D(t) := \lim_{s \to t} \frac{D(u(t), u(s))}{|t - s|}. \tag{4.2} \]

In particular, for any $u \in \mathrm{AC}^1(0,1; (V, D))$ the metric derivative exists for a.a. $t \in\, ]0,1[$ and satisfies $|\dot u|_D \in L^1(0,1)$ as well as $|\dot u|_D \le m$ a.e. in $[0,1]$ for all $m$ in (4.1).

We start with a result for the regular case, i.e. the vector and scalar fields $\Xi$ and $\xi$ are sufficiently smooth. The proof of the following result can be found in [Man07], where representation formulas for solutions of the inhomogeneous continuity equation

\[ \partial_t\mu + \operatorname{div}(\Xi\mu) = 4\xi\mu \tag{4.3} \]

based on dynamic plans are proved. We will briefly recall these results; however, since the cone structure did not play a role in [Man07], we will reinterpret the results in our setting. In the following we understand weak convergence in the space of measures as convergence against bounded and continuous functions. Moreover, a curve $s \mapsto \mu(s) \in \mathcal{M}(\Omega)$ is called weakly continuous if and only if $\mu(s)$ weakly converges to $\mu(t)$ in $\mathcal{M}(\Omega)$ for $s \to t$.


Proposition 4.1 ([Man07, Prop. 3.6]) Assume that $\Xi \in L^1(0,T; W^{1,\infty}(\Omega; \mathbb{R}^d))$ and $\xi \in C([0,T] \times \Omega)$ is locally Lipschitz with respect to the spatial variable. Then, for any $\mu_0 \in \mathcal{M}(\Omega)$ there exists a unique, weakly continuous solution $t \mapsto \mu(t)$ of (4.3) with $\mu(0) = \mu_0$.

Moreover, for an arbitrary lift $\lambda_0 \in \mathcal{M}(\mathfrak{C}_\Omega)$ of $\mu_0$ the curve defined by

\[ \lambda(t) = [X(t; \cdot), R(t; \cdot)]_\#\lambda_0 \in \mathcal{M}(\mathfrak{C}_\Omega), \tag{4.4} \]

where $t \mapsto (X(t; x), R(t; x, r))$ is the solution of the ODE system

\[ \dot X(t; x) = \Xi(t, X(t; x)), \qquad \dot R(t; x, r) = 2\,\xi(t, X(t; x))\, R(t; x, r) \]
with initial conditions $X(0; x) = x$ and $R(0; x, r) = r$, is a lift of $\mu(t)$.
Note that we can solve the equation for $R$ explicitly and obtain


\[ R(t; x, r) = r \exp\Big( 2 \int_0^t \xi(s, X(s; x))\, \mathrm{d}s \Big). \]
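The explicit formula for $R$ can be compared with a direct numerical integration of the characteristic system. The sketch below uses the toy fields $\Xi \equiv 1$ and $\xi(t, x) = x$ in one space dimension, which are assumptions made purely for illustration, together with a classical Runge-Kutta scheme.

```python
import math


def radius_explicit(t, x, r):
    """r * exp(2 * int_0^t xi(s, X(s; x)) ds) for the toy fields Xi = 1, xi(t, x) = x."""
    # With X(s; x) = x + s the integral equals x * t + t^2 / 2.
    return r * math.exp(2.0 * (x * t + 0.5 * t * t))


def radius_rk4(t_end, x, r, n=1000):
    """Integrate X' = 1, R' = 2 * X * R with the classical fourth-order Runge-Kutta scheme."""
    h = t_end / n
    X, R = x, r

    def rhs(X_, R_):
        return 2.0 * X_ * R_

    for _ in range(n):
        k1 = rhs(X, R)
        k2 = rhs(X + 0.5 * h, R + 0.5 * h * k1)
        k3 = rhs(X + 0.5 * h, R + 0.5 * h * k2)
        k4 = rhs(X + h, R + h * k3)
        R += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        X += h  # exact update since X' = 1
    return R


if __name__ == "__main__":
    print(radius_rk4(1.0, 0.3, 1.5), radius_explicit(1.0, 0.3, 1.5))
```

Since the projected mass scales like $R^2$, the factor 2 in the equation for $R$ corresponds to the factor 4 on the right-hand side of (4.3).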


It is well-known that if $\Xi$ fails to satisfy the regularity properties of Proposition 4.1 nothing guarantees uniqueness of the characteristics $t \mapsto (X(t), R(t))$ and formula (4.4) does not hold. To overcome this problem probability measures concentrated on entire trajectories in the underlying space $\mathfrak{C}_\Omega$ are introduced, see [Lis07] and [AGS05, Sect. 8.2].

More precisely, we call $\pi \in \mathcal{P}(C([0,1]; \mathfrak{C}_\Omega))$ a dynamic plan if it is concentrated on absolutely continuous curves $z \in \mathcal{A} := \mathrm{AC}^2([0,1]; (\mathfrak{C}_\Omega, d_{\mathfrak{C}}))$ and if it satisfies

\[ \int_{\mathcal{A}} \Big( \int_0^1 |\dot z|_{\mathfrak{C}}(t)^2\, \mathrm{d}t \Big)\, \mathrm{d}\pi(z) < \infty \]

with $|\dot z|_{\mathfrak{C}}$ denoting the metric derivative with respect to the cone distance $d_{\mathfrak{C}}$, see (4.2).


Note that any continuous curve $t \mapsto z(t) = [x(t), r(t)]$ with $t \in [0,1]$ satisfies $r \in C([0,1])$ with values in $[0,\infty[$. Thus, the set $O_r = r^{-1}(]0,\infty[) \subset [0,1]$ is open and the restriction of $x$ to $O_r$ is also continuous. The following lemma gives a characterization of the absolutely continuous curves in $\mathfrak{C}_\Omega$ and their metric derivative. It is proven in [LMS15].


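Lemma 4.2 A curve $t \mapsto z(t) = [x(t), r(t)] \in \mathfrak{C}_\Omega$ satisfies $z \in \mathrm{AC}^2([0,1]; \mathfrak{C}_\Omega)$ if and only if

\[ \dot r \in L^2(0,1) \quad\text{and}\quad r|\dot x| \in L^2(O_r) \quad\text{for } O_r := r^{-1}(]0,\infty[). \]

In particular, the metric time derivative is given via

\[ |\dot z|_{\mathfrak{C}}(t)^2 = \dot r(t)^2 + r(t)^2 |\dot x(t)|^2 \quad\text{for } t \in O_r \quad\text{and}\quad |\dot z|_{\mathfrak{C}}(t) = 0 \text{ otherwise.} \]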
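The formula for the metric derivative in Lemma 4.2 can be compared with a difference quotient of the cone distance. The sketch below uses a toy smooth curve with $r > 0$ over a one-dimensional base; the curve, the step size, and the helper names are illustrative assumptions.

```python
import math


def cone_dist(z0, z1):
    """Cone distance for points z = (x, r) over a 1-D base."""
    (x0, r0), (x1, r1) = z0, z1
    c = math.cos(min(abs(x1 - x0), math.pi))
    return math.sqrt(max(r0 * r0 + r1 * r1 - 2 * r0 * r1 * c, 0.0))


def curve(t):
    """Toy curve z(t) = [x(t), r(t)] with x(t) = sin t and r(t) = 1 + t^2."""
    return (math.sin(t), 1.0 + t * t)


def speed_formula(t):
    """sqrt(r'(t)^2 + r(t)^2 x'(t)^2) for the toy curve above."""
    dx, r, dr = math.cos(t), 1.0 + t * t, 2.0 * t
    return math.sqrt(dr * dr + r * r * dx * dx)


def speed_finite_difference(t, h=1e-4):
    """Difference quotient of the cone distance approximating the metric derivative."""
    return cone_dist(curve(t + h), curve(t)) / h


if __name__ == "__main__":
    print(speed_formula(0.4), speed_finite_difference(0.4))
```

The two values agree up to the discretization error of the difference quotient.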

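For $t \in [0,1]$ we denote by $e_t : C([0,1]; \mathfrak{C}_\Omega) \to \mathfrak{C}_\Omega$ the evaluation map given for $z \in C([0,1]; \mathfrak{C}_\Omega)$ by $e_t(z) = z(t)$. With a dynamic plan $\pi \in \mathcal{P}(C([0,1]; \mathfrak{C}_\Omega))$ we associate the curve $t \mapsto \lambda(t) := (e_t)_\#\pi$ which belongs to $\mathrm{AC}^2([0,1]; (\mathcal{M}(\mathfrak{C}_\Omega), W_{\mathfrak{C}}))$, see [Lis07, Thm. 4]. Moreover, from the 1-Lipschitz continuity of the projection $\mathfrak{P} : \mathcal{M}(\mathfrak{C}_\Omega) \to \mathcal{M}(\Omega)$ it follows that the curve $t \mapsto \mu(t) := \mathfrak{P}\lambda(t)$ belongs to $\mathrm{AC}^2([0,1]; (\mathcal{M}(\Omega), \mathsf{HK}))$ and the metric derivative of $\mu$ with respect to $\mathsf{HK}$ satisfies

\[ |\dot\mu|_{\mathsf{HK}}(t)^2 \le \int_{\mathcal{A}} |\dot z|_{\mathfrak{C}}(t)^2\, \mathrm{d}\pi(z). \tag{4.5} \]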


The following theorem shows that for every absolutely continuous curve in $(\mathcal{M}(\Omega), \mathsf{HK})$ a dynamic plan $\pi$ exists such that $\mu$ is induced by $\pi$ in the above sense and equality holds in (4.5). The proof is based on an extension of [Lis07, Thm. 5] and can be found in [LMS15, Thm. 8.4].


Theorem 4.3 Let $\mu \in \mathrm{AC}^2([0,1]; (\mathcal{M}(\Omega), \mathsf{HK}))$ be given. Then, there exists a dynamic plan $\pi \in \mathcal{P}(C([0,1]; \mathfrak{C}_\Omega))$ such that $\mu(t) = \mathfrak{P}((e_t)_\#\pi)$ and

\[ |\dot\mu|_{\mathsf{HK}}(t)^2 = \int_{\mathcal{A}} |\dot z|_{\mathfrak{C}}(t)^2\, \mathrm{d}\pi(z) \quad\text{for a.a. } t \in [0,1]. \tag{4.6} \]
Using this result we can also show that all geodesic curves for the Hellinger-Kantorovich distance are given by projections of geodesic curves in $\mathcal{M}_2(\mathfrak{C}_\Omega)$, i.e. all geodesic curves have the representation (3.21).

Corollary 4.4 (Representation of all geodesic curves) Let $[0,1] \ni s \mapsto \mu(s)$ be a geodesic curve and $\pi$ the dynamic plan from Theorem 4.3. Then, $s \mapsto \lambda(s) = (e_s)_\#\pi$ is a geodesic curve in $\mathcal{M}_2(\mathfrak{C}_\Omega)$ with respect to $W_{\mathfrak{C}}$.

In particular, all geodesic curves in $(\mathcal{M}(\Omega), \mathsf{HK})$ are given by an optimal plan $\gamma$ for optimal lifts of $\mu(0)$ and $\mu(1)$ in the form (3.21).

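Proof: For $0 \le s < t \le 1$, we have the elementary estimates

\[
\begin{aligned}
W_{\mathfrak{C}}(\lambda(s), \lambda(t))^2 &\le \iint d_{\mathfrak{C}}(z_0, z_1)^2\, \mathrm{d}(e_s, e_t)_\#\pi = \int_{\mathcal{A}} d_{\mathfrak{C}}(z(s), z(t))^2\, \mathrm{d}\pi \\
&\le \int_{\mathcal{A}} \Big( \int_s^t |\dot z|_{\mathfrak{C}}\, \mathrm{d}r \Big)^2 \mathrm{d}\pi \le (t - s) \int_s^t \int_{\mathcal{A}} |\dot z|_{\mathfrak{C}}^2\, \mathrm{d}\pi\, \mathrm{d}r,
\end{aligned}
\]

where we have used Hölder's inequality. Since $\mu$ is a geodesic curve, we have $|\dot\mu|_{\mathsf{HK}} = \mathsf{HK}(\mu(0), \mu(1))$ and hence, with (4.6) we have

\[ W_{\mathfrak{C}}(\lambda(s), \lambda(t)) \le (t-s)\, \mathsf{HK}(\mu(0), \mu(1)) \le (t-s)\, W_{\mathfrak{C}}(\lambda(0), \lambda(1)). \]

Arguing as in the proof of Theorem 3.11 shows that $s \mapsto \lambda(s)$ is a geodesic curve. In particular, all inequalities above are equalities.

From the dynamic plan $\pi$ we immediately find the optimal transport plan $\gamma := (e_0, e_1)_\#\pi$ between the optimal lifts $\lambda(0)$ and $\lambda(1)$, such that $\mu(s) = \mathfrak{P}\lambda(s)$. ■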

The following theorem shows that for every curve $\mu \in \mathrm{AC}^2(0,1; (\mathcal{M}(\Omega), \mathsf{HK}))$ we can find a vector and a scalar field $\Xi$ and $\xi$ such that the continuity equation in (4.3) is satisfied. Moreover, the $L^2$-norm of $(\Xi, \xi)$ with respect to $\mu(t)$ provides a lower bound for the metric time derivative of $\mu$.

Theorem 4.5 Let $\mu \in \mathrm{AC}^2([0,1]; (\mathcal{M}(\Omega), \mathsf{HK}))$ be given. Then, there exists a Borel vector field $(\Xi, \xi) : [0,1] \times \Omega \to \mathbb{R}^{d+1}$ such that the continuity equation (4.3) is satisfied and

\[ \int_\Omega \big[\, |\Xi(t,x)|^2 + 4|\xi(t,x)|^2 \,\big]\, \mathrm{d}\mu(t) \le |\dot\mu|_{\mathsf{HK}}(t)^2 \quad\text{for a.e. } t \in [0,1]. \]

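Proof: Let $\pi \in \mathcal P(C([0,1]; \mathfrak C_\Omega))$ be a dynamic plan representing $\mu$ according to Theorem
4.3. We denote the lift by $\lambda(t) = (e_t)_\#\pi$. Due to the Disintegration Theorem, see [AGS05,
Thm. 5.3.1], there exists a family of probability measures $\pi_z(t) \in \mathcal P(C([0,1]; \mathfrak C_\Omega))$ for
$\lambda(t)$-a.e. $z \in \mathfrak C_\Omega$ and each $t \in [0,1]$. Moreover, $\pi_z(t)$ is concentrated on the subset
$\mathcal A_z(t) := \{ \mathbf z \in \mathcal A : \mathbf z(t) = z \}$, and for every $F \in L^1(C([0,1]; \mathfrak C_\Omega); \pi)$ we have

\[ \int_{\mathcal A} F(\mathbf z)\, d\pi(\mathbf z) = \int_{\mathfrak C_\Omega} \Big( \int_{\mathcal A_z(t)} F(\mathbf z)\, d\pi_z(t) \Big)\, d\lambda(t). \]

For $t \in [0,1]$ and $z = [x,r] \in \mathfrak C_\Omega \setminus \{\mathfrak o\}$ we define, writing curves as $\mathbf z = [\mathbf x, \mathbf r]$, the fields

\[ \tilde\Xi(t,z) = \int_{\mathcal A_z(t)} \dot{\mathbf x}(t)\, d\pi_z(t) \quad \text{and} \quad \tilde\xi(t,z) = \frac12 \int_{\mathcal A_z(t)} \frac{\dot{\mathbf r}(t)}{\mathbf r(t)}\, d\pi_z(t), \]

while for $z = \mathfrak o$ we set $\tilde\Xi(t,z) = \tilde\xi(t,z) = 0$. Due to Jensen's inequality we have the
estimate

\[
\begin{aligned}
\int_{\mathfrak C_\Omega} \big( |\tilde\Xi(t,z)|^2 + 4|\tilde\xi(t,z)|^2 \big)\, r^2\, d\lambda(t)
&\le \int_{\mathfrak C_\Omega} \int_{\mathcal A_z(t)} \big( \mathbf r(t)^2 |\dot{\mathbf x}(t)|^2 + |\dot{\mathbf r}(t)|^2 \big)\, d\pi_z(t)\, d\lambda(t) \\
&= \int_{\mathcal A} |\dot{\mathbf z}|_{\mathfrak C}^2(t)\, d\pi = |\dot\mu|_{\mathsf{HK}}^2(t).
\end{aligned}
\]

To obtain fields $\Xi$ and $\xi$ on $\Omega$ we employ the Disintegration Theorem for $r^2\lambda$ and
its $x$-marginal $\mu = \mathfrak P\lambda$ to obtain a family of probability measures $\nu_x$ concentrated on $[0,\infty[$
such that $r^2\lambda = \nu_x\, \mu$. Using again Jensen's inequality we easily check that the fields
$\Xi(t,x) = \int_{[0,\infty[} \tilde\Xi(t,[x,r])\, d\nu_x(r)$ and $\xi(t,x) = \int_{[0,\infty[} \tilde\xi(t,[x,r])\, d\nu_x(r)$ satisfy

\[ \int_\Omega \big( |\Xi(t,x)|^2 + 4|\xi(t,x)|^2 \big)\, d\mu(t) \le |\dot\mu|_{\mathsf{HK}}^2(t) \quad \text{for a.e. } t \in [0,1]. \]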


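It remains to show that the continuity equation (4.3) is satisfied. For this, we choose
a test function of the form $\varphi(x,t) = \eta(x)\psi(t)$ with $\eta$ and $\psi$ Lipschitz and bounded with
compact support in $\Omega$ and $]0,1[$, respectively. We compute

\[
\begin{aligned}
\int_0^1 \int_\Omega \eta(x)\dot\psi(t)\, d\mu(t)\, dt
&= \int_{\mathcal A} \int_0^1 \dot\psi(t)\, \eta(\mathbf x(t))\, \mathbf r(t)^2\, dt\, d\pi \\
&= -\int_{\mathcal A} \int_0^1 \psi(t) \Big[ \nabla\eta(\mathbf x(t)) \cdot \dot{\mathbf x}(t) + 4\eta(\mathbf x(t)) \frac{\dot{\mathbf r}(t)}{2\mathbf r(t)} \Big] \mathbf r(t)^2\, dt\, d\pi \\
&= -\int_0^1 \int_{\mathfrak C_\Omega} \psi(t) \big[ \nabla\eta(x) \cdot \tilde\Xi(t,z) + 4\eta(x)\tilde\xi(t,z) \big]\, r^2\, d\lambda(t)\, dt \\
&= -\int_0^1 \int_\Omega \psi(t) \big[ \nabla\eta(x) \cdot \Xi(t,x) + 4\eta(x)\xi(t,x) \big]\, d\mu(t)\, dt.
\end{aligned}
\]

Thus, the continuity equation is satisfied in the distributional sense. ■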

Next, we show the reverse implication. 

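Theorem 4.6 Let $t \mapsto \mu(t)$ be a narrowly continuous curve in $\mathcal M(\Omega)$ and suppose that
there exist a Borel vector field $\Xi : [0,1] \times \Omega \to \mathbb R^d$ and a scalar field $\xi : [0,1] \times \Omega \to \mathbb R$
satisfying $(\Xi(t),\xi(t)) \in L^2(d\mu(t); \mathbb R^{d+1})$ and

\[ \int_0^1 \int_\Omega \big( |\Xi(t,x)|^2 + 4|\xi(t,x)|^2 \big)\, d\mu(t)\, dt < \infty \]

such that the continuity equation (4.3) is satisfied. Then, $\mu \in \mathrm{AC}^2([0,1]; (\mathcal M(\Omega), \mathsf{HK}))$ and

\[ |\dot\mu|_{\mathsf{HK}}^2(t) \le \int_\Omega \big( |\Xi(t,x)|^2 + 4|\xi(t,x)|^2 \big)\, d\mu(t) \quad \text{for a.e. } t \in\, ]0,1[. \tag{4.7} \]

Proof: Let $\mu$, $\Xi$, and $\xi$ be given as in the statement of the theorem. Due to Lemma
3.10 in [Man07], for $\varepsilon > 0$ we can obtain sufficiently smooth approximations $\mu_\varepsilon$, $\Xi_\varepsilon$, and
$\xi_\varepsilon$ satisfying the continuity equation (4.3) and converging in a suitable sense to $\mu$, $\Xi$, and
$\xi$, respectively.
Moreover, for any convex, nondecreasing function $\psi : [0,\infty[ \to [0,\infty[$, the approximations
$\mu_\varepsilon$, $\Xi_\varepsilon$, and $\xi_\varepsilon$ satisfy the uniform estimates

\[
\int_\Omega \psi(|\Xi_\varepsilon(t,x)|)\, d\mu_\varepsilon(t) \le \int_\Omega \psi(|\Xi(t,x)|)\, d\mu(t)
\quad \text{and} \quad
\int_\Omega \psi(|\xi_\varepsilon(t,x)|)\, d\mu_\varepsilon(t) \le \int_\Omega \psi(|\xi(t,x)|)\, d\mu(t).
\tag{4.8}
\]

Applying the representation result in Proposition 4.1 for $\mu_\varepsilon$, $\Xi_\varepsilon$, and $\xi_\varepsilon$ we obtain the
formula

\[ \mu_\varepsilon(t) = \mathfrak P\lambda_\varepsilon(t), \qquad \lambda_\varepsilon(t) = [X_\varepsilon(t;\cdot), R_\varepsilon(t;\cdot,\cdot)]_\#\lambda_\varepsilon(0), \tag{4.9} \]

where $\lambda_\varepsilon(0)$ is a lift of $\mu_\varepsilon(0)$ and $X_\varepsilon$ and $R_\varepsilon$ are the maximal solutions of

\[ \dot X_\varepsilon(t;x) = \Xi_\varepsilon(t, X_\varepsilon(t;x)), \qquad \dot R_\varepsilon(t;x,r) = 2\xi_\varepsilon(t, X_\varepsilon(t;x))\, R_\varepsilon(t;x,r) \tag{4.10} \]

subject to the initial conditions $X_\varepsilon(0;x) = x$ and $R_\varepsilon(0;x,r) = r$. With this we define the
map $\Phi_\varepsilon : \mathfrak C_\Omega \to C([0,1]; \mathfrak C_\Omega)$ via

\[ \Phi_\varepsilon([x,r]) = \big( t \mapsto [X_\varepsilon(t;x), R_\varepsilon(t;x,r)] \big), \]

and introduce the dynamic plan $\pi_\varepsilon \in \mathcal M(C([0,1]; \mathfrak C_\Omega))$ as $\pi_\varepsilon = (\Phi_\varepsilon)_\#\lambda_\varepsilon(0)$.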

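We aim to show that the sequence $(\pi_\varepsilon)_\varepsilon$ is tight such that we can find a subsequence
(not relabeled) that is narrowly converging to a dynamic plan $\pi$. Indeed, using Lemma
4.2, (4.10), and the representation formula (4.9) we have for $0 \le t_0 < t_1 \le 1$ that

\[
\begin{aligned}
\int_{\mathcal A} \int_{t_0}^{t_1} |\dot{\mathbf z}|_{\mathfrak C}^2(t)\, dt\, d\pi_\varepsilon(\mathbf z)
&= \int_{\mathfrak C_\Omega} \int_{t_0}^{t_1} \big( R_\varepsilon(t;x,r)^2 |\dot X_\varepsilon(t;x)|^2 + |\dot R_\varepsilon(t;x,r)|^2 \big)\, dt\, d\lambda_\varepsilon(0) \\
&= \int_{\mathfrak C_\Omega} \int_{t_0}^{t_1} \big( |\Xi_\varepsilon(t,X_\varepsilon)|^2 + 4|\xi_\varepsilon(t,X_\varepsilon)|^2 \big) R_\varepsilon^2\, dt\, d\lambda_\varepsilon(0) \\
&= \int_{t_0}^{t_1} \int_\Omega \big( |\Xi_\varepsilon(t,x)|^2 + 4|\xi_\varepsilon(t,x)|^2 \big)\, d\mu_\varepsilon(t)\, dt \\
&\le \int_{t_0}^{t_1} \int_\Omega \big( |\Xi(t,x)|^2 + 4|\xi(t,x)|^2 \big)\, d\mu(t)\, dt < \infty.
\end{aligned}
\]

Since the functional $\mathbf z \mapsto \int_0^1 |\dot{\mathbf z}|_{\mathfrak C}^2\, dt$ has compact sublevels in $\{ \mathbf z \in C([0,1]; \mathfrak C_\Omega) \mid \mathbf z(0) =
[x,r] \}$, we have shown the tightness of $(\pi_\varepsilon)_\varepsilon$ and we can extract a subsequence narrowly
converging to a limit $\pi$ in $\mathcal M(C([0,1]; \mathfrak C_\Omega))$. Moreover, due to the lower semicontinuity of
$\mathbf z \mapsto \int_0^1 |\dot{\mathbf z}|_{\mathfrak C}^2\, dt$ we immediately obtain

\[ \int_{\mathcal A} \int_{t_0}^{t_1} |\dot{\mathbf z}|_{\mathfrak C}^2(t)\, dt\, d\pi(\mathbf z) < \infty. \]

Hence, $\pi$ is concentrated on absolutely continuous curves.

It remains to show that $\pi$ is a dynamic plan for $\mu$, i.e., $\mu(t) = \mathfrak P((e_t)_\#\pi)$, from
which (4.7) follows. We easily show that due to the construction of $\pi_\varepsilon$ we have that
$\mathfrak P((e_t)_\#\pi_\varepsilon) = \mu_\varepsilon(t)$. The claim now follows from the continuity of the map $\pi \mapsto \mathfrak P((e_t)_\#\pi)$
with respect to the weak convergence in $\mathcal M(C([0,1]; \mathfrak C_\Omega))$. ■

Finally, we can prove the equivalence of the dynamical and the transport plan formulation
of the Hellinger-Kantorovich distance.

Proof of Theorem 3.6(v).

We want to show that for all $\mu_0, \mu_1 \in \mathcal M(\Omega)$ we have

\[ \mathsf{HK}(\mu_0,\mu_1) = \mathsf D_{1,4}(\mu_0,\mu_1). \]

Due to Theorem 4.6 we have for any curve $t \mapsto \mu(t)$ connecting $\mu_0$ and $\mu_1$ and satisfying
the continuity equation (4.3) for a vector field $\Xi$ and a scalar field $\xi$ the upper
estimate

\[ \mathsf{HK}(\mu_0,\mu_1)^2 \le \int_0^1 |\dot\mu|_{\mathsf{HK}}^2(t)\, dt \le \int_0^1 \int_\Omega \big( |\Xi(t,x)|^2 + 4\xi(t,x)^2 \big)\, d\mu(t)\, dt. \]

Thus, by minimizing over all absolutely continuous curves $t \mapsto \mu(t)$ connecting $\mu_0$ and $\mu_1$
we have $\mathsf{HK}(\mu_0,\mu_1) \le \mathsf D_{1,4}(\mu_0,\mu_1)$.

To show that equality holds, we consider a geodesic curve $s \mapsto \mu(s)$, which is obviously
absolutely continuous with respect to $\mathsf{HK}$ and satisfies $|\dot\mu|_{\mathsf{HK}} \equiv \mathsf{HK}(\mu_0,\mu_1)$. By Theorem
4.5 we then have

\[ \mathsf{HK}(\mu_0,\mu_1)^2 = \int_0^1 |\dot\mu|_{\mathsf{HK}}^2(t)\, dt \ge \int_0^1 \int_\Omega \big( |\Xi(t,x)|^2 + 4\xi(t,x)^2 \big)\, d\mu(t)\, dt. \]

Since the right-hand side is bounded from below by $\mathsf D_{1,4}(\mu_0,\mu_1)^2$, we have proven Theorem 3.6(v).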


5 Geodesic curves for $\mathsf{HK}$


In this section, we provide some general results on geodesics as well as a few illuminating
examples. Section 5.1 deals with the geodesic $\Lambda$-convexity of functionals $\mathcal F : \mathcal M(\Omega) \to
\mathbb R \cup \{\infty\}$. In particular, we show that the linear functional $\mathcal F_\Phi : \mu \mapsto \int_\Omega \Phi(x)\, d\mu(x)$ is
geodesically $\Lambda$-convex if and only if the function $[x,r] \mapsto r^2\Phi(x)$ is $\Lambda$-convex on $(\mathfrak C_\Omega, \mathsf d_{\mathfrak C})$.
In Section 5.2, we return to the problem of moving the measure $\mu_0 = a_0\delta_{y_0}$ to $\mu_1 = a_1\delta_{y_1}$
and show that in the case $|y_1 - y_0| = \pi/2$ there is an infinite set of geodesic curves.
Moreover, we show that all these curves are indeed solutions of the formally derived
equation

\[ \frac{\partial}{\partial s}\mu + \operatorname{div}(\mu\nabla\xi) = 4\xi\mu, \qquad \frac{\partial}{\partial s}\xi + \frac12 |\nabla\xi|^2 + 2\xi^2 = 0, \tag{5.1} \]

where $\xi$ is in fact the same for all geodesic connections. In Section 5.3 we discuss geodesic
curves that are induced by dilation of measures. Section 5.4 discusses how the geodesic
curve connecting $\mu_0 = \chi_{[0,1]}$ and $\mu_1 = \chi_{[2,3]}$ can be constructed.

Finally, Section 5.6 shows that $(\mathcal M(\Omega), \mathsf{HK})$ is not a positively curved (PC) space in
the sense of Alexandrov (cf. [AGS05, Sect. 12.3]) if $\Omega$ is two-dimensional.

5.1 Geodesic $\Lambda$-convexity for some functionals

Here we give some first results on geodesic $\Lambda$-convexity for functionals $\mathcal F : \mathcal M(\Omega) \to
\mathbb R \cup \{\infty\}$, which is defined via

\[
\forall\, \text{geod. curves } \mu : [0,1] \to \mathcal M(\Omega): \quad
\mathcal F(\mu(s)) \le (1-s)\mathcal F(\mu(0)) + s\mathcal F(\mu(1)) - \Lambda \frac{s(1-s)}{2}\, \mathsf{HK}(\mu(0),\mu(1))^2.
\]

We first provide an exact characterization of $\Lambda$ for linear functionals, then give some
preliminary results and conjectures for nonlinear functionals.

It is well-known that in the case of the Wasserstein distance, geodesic $\Lambda$-convexity
of functionals of the form $\mathcal F_\Phi(\mu) = \int_\Omega \Phi(x)\, d\mu(x)$ is satisfied if and only if $x \mapsto \Phi(x)$ is
$\Lambda$-convex, see [AGS05, Sect. 9.3]. Hence, it is natural to ask whether the same can be
said for the Hellinger-Kantorovich distance $\mathsf{HK}$ and geodesics in the cone space.

We start with a very easy relation for the total mass along the geodesic curves. It
turns out that the mass is a convex quadratic function of the parameter $s$.


Proposition 5.1 Consider a geodesic curve $[0,1] \ni s \mapsto \mu(s)$ given by (3.21) and set

\[ m(s) := |\mu|(s) = \int_\Omega d\mu(s). \]

Then, we have

\[
m(s) = (1-s)^2 m(0) + s^2 m(1) + 2s(1-s)\, m_*
\quad \text{with } m_* := \iint r_0 r_1 \cos_\pi(|x_1 - x_0|)\, d\gamma([x_0,r_0],[x_1,r_1]),
\tag{5.3a}
\]

\[ (1-s)^2 m(0) + s^2 m(1) \le m(s) \le \big( (1-s)\sqrt{m(0)} + s\sqrt{m(1)} \big)^2 \tag{5.3b} \]

for all $s \in [0,1]$. Moreover, $m''(s) = 2\,\mathsf{HK}(\mu_0,\mu_1)^2 \ge 0$, which implies

\[ m(s) = (1-s)\, m(0) + s\, m(1) - s(1-s)\, \mathsf{HK}(\mu_0,\mu_1)^2. \tag{5.4} \]


Proof: Using the definition of $\mu(s)$ via the projection $\mathfrak P$ and $Z(s;\cdot,\cdot)$ in (3.13) we have

\[ m(s) = \int_\Omega d\mu(s) = \int_{\mathfrak C_\Omega} r^2\, d\lambda(s) = \int_{\mathfrak C_\Omega} r^2\, d\big( Z(s;\cdot,\cdot)_\#\gamma \big) = \iint_{\mathfrak C_\Omega \times \mathfrak C_\Omega} R(s;z_0,z_1)^2\, d\gamma(z_0,z_1). \]

Using the explicit quadratic structure of $R^2$ given in (3.13), we find the quadratic
formula (5.3a).

For estimate (5.3b) we use the fact that $\cos_\pi(|x_1 - x_0|)$ takes values only in the interval
$[0,1]$ on the support of $\gamma$, see Proposition 3.4(c). By the Cauchy-Schwarz estimate we
have $0 \le m_* \le \sqrt{m(0)m(1)}$, which implies the estimates.

Obviously, we have $m''(s) = 2(m(0) + m(1) - 2m_*)$, and comparing to the characterization
(3.20) we find $m''(s) = 2\,\mathsf{HK}(\mu_0,\mu_1)^2$ as desired. ■
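The two mass formulas (5.3a) and (5.4) can be compared numerically in the simplest situation. The following is a minimal sketch, assuming a single pair of Dirac masses $\mu_0 = r_0^2\delta_{x_0}$ and $\mu_1 = r_1^2\delta_{x_1}$ with $\theta = |x_1 - x_0| < \pi/2$, so that $m_* = r_0 r_1\cos\theta$ and $\mathsf{HK}^2 = m(0) + m(1) - 2m_*$ as in the proof above. The function names and the sample values are illustrative choices, not notation from the text.

```python
import math

def mass_quadratic(s, r0, r1, theta):
    """m(s) from (5.3a) for one transported atom; theta = |x1 - x0| < pi/2."""
    m_star = r0 * r1 * math.cos(theta)
    return (1 - s) ** 2 * r0 ** 2 + s ** 2 * r1 ** 2 + 2 * s * (1 - s) * m_star

def hk_squared(r0, r1, theta):
    """HK^2 = m(0) + m(1) - 2 m_*, the value used in the proof of Proposition 5.1."""
    return r0 ** 2 + r1 ** 2 - 2 * r0 * r1 * math.cos(theta)

def mass_affine_minus_bubble(s, r0, r1, theta):
    """m(s) from (5.4): affine interpolation of the masses minus s(1-s) HK^2."""
    return (1 - s) * r0 ** 2 + s * r1 ** 2 - s * (1 - s) * hk_squared(r0, r1, theta)

# Hypothetical sample data: masses 2.25 and 0.49 at angular distance 1.0 < pi/2.
r0, r1, theta = 1.5, 0.7, 1.0
samples = [k / 10 for k in range(11)]

# Largest discrepancy between (5.3a) and (5.4) on the sample grid.
max_gap = max(
    abs(mass_quadratic(s, r0, r1, theta) - mass_affine_minus_bubble(s, r0, r1, theta))
    for s in samples
)

# Two-sided bound (5.3b), checked with a small floating-point slack.
bounds_hold = all(
    (1 - s) ** 2 * r0 ** 2 + s ** 2 * r1 ** 2 - 1e-12
    <= mass_quadratic(s, r0, r1, theta)
    <= ((1 - s) * r0 + s * r1) ** 2 + 1e-12
    for s in samples
)
```

Both expressions agree up to rounding, which reflects that (5.4) is just a rearrangement of (5.3a) once $\mathsf{HK}^2 = m(0) + m(1) - 2m_*$ is inserted.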


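Next, we consider the linear functional $\mathcal F_\Phi(\mu) = \int_\Omega \Phi(x)\, d\mu(x)$ with $\Phi \in C^0(\Omega)$.

Proposition 5.2 Let $\Phi \in C^0(\Omega)$ be given and define $\widehat\Phi([x,r]) = r^2\Phi(x)$. Then the functional
$\mathcal F_\Phi : \mathcal M(\Omega) \to \mathbb R$ is $\Lambda$-convex along $[0,1] \ni s \mapsto \mu(s)$ given by (3.21) if and only if
$\widehat\Phi : \mathfrak C_\Omega \to \mathbb R$ is $\Lambda$-convex.


Proof: Assume that $\widehat\Phi : \mathfrak C_\Omega \to \mathbb R$ is $\Lambda$-convex. We use the definition of $s \mapsto \mu(s)$ in
(3.21) to find

\[ \mathcal F_\Phi(\mu(s)) = \int_{\mathfrak C_\Omega} r^2\Phi(x)\, d\lambda(s) = \iint_{\mathfrak C_\Omega \times \mathfrak C_\Omega} \widehat\Phi(Z(s;\cdot,\cdot))\, d\gamma, \]

where $\gamma \in \mathcal M_2(\mathfrak C_\Omega \times \mathfrak C_\Omega)$ is an optimal plan for $\mu_0 = \mu(0)$ and $\mu_1 = \mu(1)$. Thus, with the
convexity of $\widehat\Phi$ and the optimality of $\gamma$ we find

\[ \mathcal F_\Phi(\mu(s)) \le (1-s)\mathcal F_\Phi(\mu_0) + s\mathcal F_\Phi(\mu_1) - \frac{\Lambda}{2}\, s(1-s)\, \mathsf{HK}(\mu_0,\mu_1)^2. \]

Conversely, if $\mathcal F_\Phi$ is $\Lambda$-convex on $\mathcal M(\Omega)$ we can consider geodesic curves for two Dirac
measures $\mu_0 = a_0\delta_{x_0}$ and $\mu_1 = a_1\delta_{x_1}$ to obtain convexity of $\widehat\Phi$ along geodesics in $\mathfrak C_\Omega$.

However, as transport above the threshold $\pi/2$ is not optimal, this excludes geodesic
curves in $\mathfrak C_\Omega$ for distances $\pi/2 < |x_0 - x_1| < \pi$. We note that this case can always
be reduced to two overlapping geodesic curves for distances below $\pi/2$. ■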

Finally, we provide some negative results for functionals that are geodesically $\Lambda$-convex
for the Wasserstein-Kantorovich distance but not for the Hellinger-Kantorovich distance.
A simple necessary condition is obtained by realizing that for all $\mu_1 \in \mathcal M(\Omega)$ the Hellinger
geodesic

\[ \mu^{\mathsf H}(s) := s^2\mu_1 \]

is also the unique geodesic in $(\mathcal M(\Omega), \mathsf{HK})$ connecting $\mu_0 = 0$ and $\mu_1$. Indeed, this easily
follows from the fact that the possible lifts of $\mu_0$ are given by $a\delta_{\mathfrak o} \in \mathcal M_2(\mathfrak C_\Omega)$ with $a \ge 0$,
and that geodesics in $(\mathfrak C_\Omega, \mathsf d_{\mathfrak C})$ connecting $\mathfrak o$ and $z_1 = [x,r]$ are simply given by $z(s) =
[x, sr]$.

Applying this to Boltzmann's logarithmic entropy $\mathcal E : \mu \mapsto \int_\Omega F_{\mathrm B}(d\mu/dx)\, dx$, we see
that it is not geodesically $\Lambda$-convex with respect to $\mathsf{HK}$. For this, consider the geodesic
$\mu(s) = s^2 u\, dx$, where $u \in L^1(\Omega)$, $u \ge 0$, and $|\mu| = \int_\Omega u\, dx > 0$, to find the relation

\[ \mathcal E(\mu(s)) = s^2\, \mathcal E(u\, dx) + (1 - s^2) \int_\Omega 1\, dx + 2s^2 \log s \int_\Omega u\, dx. \]

Clearly, the last term destroys geodesic $\Lambda$-convexity. Similarly, for $p \in\, ]0,1[\, \cup\, ]1,\infty[$ we
may look at functionals of the form

\[ \mathcal E_p(\mu) = \int_\Omega \frac{1}{p-1} \Big( \frac{d\mu}{dx} \Big)^p dx. \tag{5.5} \]

Along the geodesics $\mu(s) = s^2 u\, dx$ we obtain $e_p(s) := \mathcal E_p(\mu(s)) = s^{2p}\, \mathcal E_p(\mu(1))$. For $p \in\,
]1/2,1[$ we conclude $e_p''(s) \to -\infty$ for $s \downarrow 0$ due to $e_p(1) < 0$. Hence, for these $p$ the
functional $\mathcal E_p$ is not geodesically $\Lambda$-convex for any $\Lambda \in \mathbb R$ with respect to $\mathsf{HK}$.
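The scaling relation for the Boltzmann entropy along $\mu(s) = s^2 u\,dx$ can be checked on a grid. This is a minimal sketch under two assumptions that are not stated in this passage: the density function is taken as $F_{\mathrm B}(z) = z\log z - z + 1$ (the form that reproduces the displayed relation), and $\Omega = [0,1]$ is discretized by a midpoint rule with a hypothetical positive density `u`.

```python
import math

def f_boltzmann(z):
    """Assumed Boltzmann density F_B(z) = z log z - z + 1, with F_B(0) = 1."""
    return z * math.log(z) - z + 1.0 if z > 0 else 1.0

def entropy(density, dx):
    """Midpoint-rule approximation of the integral of F_B(density) over the grid."""
    return sum(f_boltzmann(v) for v in density) * dx

# Hypothetical test density on Omega = [0, 1].
n = 200
dx = 1.0 / n
u = [1.0 + 0.5 * math.sin(2 * math.pi * (i + 0.5) * dx) for i in range(n)]
mass = sum(u) * dx      # approximates the integral of u
volume = n * dx         # approximates the integral of 1

def entropy_along_geodesic(s):
    """E(mu(s)) for the Hellinger geodesic mu(s) = s^2 u dx."""
    return entropy([s * s * v for v in u], dx)

def scaling_formula(s):
    """Right-hand side of the relation: s^2 E(u dx) + (1 - s^2)|Omega| + 2 s^2 log(s) |mu|."""
    return s * s * entropy(u, dx) + (1 - s * s) * volume + 2 * s * s * math.log(s) * mass

max_gap = max(abs(entropy_along_geodesic(s) - scaling_formula(s))
              for s in (0.1, 0.3, 0.5, 0.9, 1.0))

def log_term_second_derivative(s):
    """Second derivative in s of 2 s^2 log s, which equals 4 log s + 6."""
    return 4 * math.log(s) + 6
```

The helper `log_term_second_derivative` makes the obstruction explicit: $4\log s + 6$ is unbounded from below as $s \downarrow 0$, so no lower bound $\Lambda$ on the second derivative along this geodesic can hold.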

The following remark supports the conjecture that the functional $\mathcal E_p$ is geodesically
convex on $(\mathcal M(\Omega), \mathsf{HK})$, or more generally on $(\mathcal M(\Omega), \mathsf D_{\alpha,\beta})$, for all $p > 1$. It is based on
the formal differential calculus developed in [LiM13], which was in fact the stimulus of
this work. If this is the case one may consider the geodesically $\Lambda$-convex gradient system
$(\mathcal M(\Omega), \mathcal E_p + \mathcal F_\Phi, \mathbb K_{\alpha,\beta})$, which corresponds to the partial differential equation

\[
\partial_t u = -\mathbb K_{\alpha,\beta}(u)\, \mathrm D(\mathcal E_p + \mathcal F_\Phi)(u)
= \alpha \big( \Delta(u^p) + \operatorname{div}(u\nabla\Phi) \big) - \beta \Big( \frac{p}{p-1}\, u^p + \Phi u \Big),
\]

complemented by the no-flux boundary conditions $(\nabla u^p + u\nabla\Phi) \cdot \nu = 0$. Note that
this equation always has the solution $u \equiv 0$, which is different from the unique minimizer
$u_{\min}$ of $\mathcal E_p + \mathcal F_\Phi$ if $\Phi$ attains negative values somewhere. Indeed, we have $u_{\min} :=
\big( \tfrac{p-1}{p} \max\{0, -\Phi\} \big)^{1/(p-1)}$. We refer to [PQV14, Eqn. (2.1)] for an application for modeling
of tumor growth.


Remark 5.3 (Geodesic convexity via Eulerian calculus) Following ideas in [OtW05,
DaS08], a formal calculus for reaction-diffusion systems was developed in [LiM13]. The idea
is to characterize the geodesic $\Lambda$-convexity of $\mathcal E(u\,dx) = \int_\Omega E(u(x))\, dx$ on $(\mathcal M(\Omega), \mathsf D_{\alpha,\beta})$
by calculating the quadratic form $M(u,\cdot)$ generated by the contravariant Hessian of $\mathcal E$:

\[
M(u,\xi) = \langle \xi, \mathrm D V(u)\, \mathbb K_{\alpha,\beta}(u)\xi \rangle - \tfrac12 \mathrm D_u \langle \xi, \mathbb K_{\alpha,\beta}(u)\xi \rangle [V(u)]
\quad \text{with } V(u) = \mathbb K_{\alpha,\beta}(u)\, \mathrm D\mathcal E(u).
\]

Then, one needs to show the estimate $M(u,\xi) \ge \Lambda \langle \xi, \mathbb K_{\alpha,\beta}(u)\xi \rangle$.

Following the methods in [LiM13, Sect. 4], for $u \in C^0(\Omega)$ and smooth $\xi$ we obtain

\[
M(u,\xi) = \int_\Omega \Big\{ \alpha^2 \big( (A(u) - H(u)) (\Delta\xi)^2 + H(u) |\mathrm D^2\xi|^2 \big)
+ \alpha\beta \big( B_1(u) |\nabla\xi|^2 + B_2(u)\, \xi\Delta\xi \big) + \beta^2 B_3(u)\, \xi^2 \Big\}\, dx,
\]

where $A(u) = u^2 E''(u)$, $H(u) = uE'(u) - E(u)$, $B_1(u) = \tfrac32 uE'(u) - E(u)$,
$B_2(u) = -2u^2 E''(u) + uE'(u) - E(u)$, and $B_3(u) = u^2 E''(u) + \tfrac{u}{2} E'(u)$.
For the special case $E(u) = u^p/(p-1)$ with $p > 1$ we find the relation

\[
M(u,\xi) = \int_\Omega \Big\{ \alpha^2 \big( (p-1)(\Delta\xi)^2 + |\mathrm D^2\xi|^2 \big)
+ \alpha\beta \Big( \frac{3p-2}{2p-2} |\nabla\xi|^2 - (2p-1)\, \xi\Delta\xi \Big)
+ \beta^2\, \frac{2p^2 - p}{2p-2}\, \xi^2 \Big\}\, u^p\, dx,
\]

which is nonnegative, because the mixed term can be estimated via

\[ -\alpha\beta (2p-1)\, \xi\Delta\xi \ge -\alpha^2 (p-1) (\Delta\xi)^2 - \beta^2\, \frac{(2p-1)^2}{4(p-1)}\, \xi^2 \]

and $\frac{(2p-1)^2}{4(p-1)} \le \frac{2p^2-p}{2p-2}$ for $p > 1$.

Thus, the formal Eulerian calculus suggests that $\mathcal E_p$ is geodesically convex with respect
to $\mathsf{HK}$ for all $p > 1$. This investigation will be continued in subsequent work.

5.2 Geodesic connections for two Dirac measures


While for the characterization of the distance $\mathsf{HK}$ the choice of the geodesics is not so
relevant, we want to highlight that the set of geodesic connections between two measures
can be very large. As shown in Section 3.4 and Corollary 4.4, all geodesic curves can be
constructed from optimal couplings $\gamma \in \mathcal M_2(\mathfrak C_\Omega \times \mathfrak C_\Omega)$ in (3.16). However, many $\gamma$ lead
via the projection $\mathfrak P$ to the same geodesic curve. Here we show that the set of geodesics
may still form an infinite-dimensional convex set.

We treat the case of two Dirac measures $a_j\delta_{y_j}$. The case $|y_1 - y_0| \ne \pi/2$ is trivial, since
only one geodesic connection exists. However, for the critical distance $|y_1 - y_0| = \pi/2$ an
uncountable number of linearly independent connecting geodesics exists, such that the
span of the convex set of all geodesics is infinite dimensional.

We consider $\mu_0 = a_0\delta_{y_0}$ and $\mu_1 = a_1\delta_{y_1}$ with $a_0, a_1 > 0$. As was shown before, there
is exactly one connecting geodesic if $|y_0 - y_1| \ne \pi/2$. Indeed, for $|y_0 - y_1| < \pi/2$ we have
$\mu(s) = a(s)\delta_{x(s)}$ as discussed in Section 3.1. For $|y_0 - y_1| > \pi/2$ we have a pure Hellinger
case with $\mu(s) = (1-s)^2 a_0\delta_{y_0} + s^2 a_1\delta_{y_1}$.

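For the critical case $|y_0 - y_1| = \pi/2$ we have a huge set of possible geodesics, since we
may consider all lifts $\lambda_0, \lambda_1 \in \mathcal M([0,\infty[)$ satisfying

\[ a_j = \int_{[0,\infty[} r^2\, d\lambda_j(r) \ \text{ for } j = 0,1 \quad \text{and} \quad \lambda_0([0,\infty[) = \lambda_1([0,\infty[). \]

Now every coupling $\gamma \in \Gamma(\lambda_0,\lambda_1) = \{ \gamma \in \mathcal M([0,\infty[^2) \mid \Pi^i_\#\gamma = \lambda_i \}$ provides an optimal
coupling. To see this, we set $\bar\lambda_j = \delta_{y_j} \otimes \lambda_j \in \mathcal M(\mathfrak C_\Omega)$ and $\bar\gamma = \delta_{y_0} \otimes \delta_{y_1} \otimes \gamma \in \mathcal M(\mathfrak C_\Omega \times \mathfrak C_\Omega)$.
Since $|y_0 - y_1| = \pi/2$ implies $\mathsf d_{\mathfrak C}([y_0,r_0],[y_1,r_1])^2 = r_0^2 + r_1^2$, we find

\[
\begin{aligned}
\iint_{\mathfrak C_\Omega \times \mathfrak C_\Omega} \mathsf d_{\mathfrak C}([x_0,r_0],[x_1,r_1])^2\, d\bar\gamma
&= \iint_{[0,\infty[^2} \mathsf d_{\mathfrak C}([y_0,r_0],[y_1,r_1])^2\, d\gamma \\
&= \iint_{[0,\infty[^2} (r_0^2 + r_1^2)\, d\gamma = a_0 + a_1 = \mathsf{HK}(a_0\delta_{y_0}, a_1\delta_{y_1})^2.
\end{aligned}
\]

Now, geodesic curves can be constructed for every $\bar\gamma$ as defined in (3.21). Obviously the
set of all possible pairs $(\lambda_0,\lambda_1)$ is convex and therefore also the set of all $\gamma \in \Gamma(\lambda_0,\lambda_1)$.
Hence, the set of all optimal $\bar\gamma$ is convex.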

However, the mapping from $\bar\gamma$ to $\mu(\cdot)$ is not injective, since there is a huge redundancy.
Indeed, by the definition $\mu(s) = \mathfrak P\, Z(s;\cdot,\cdot)_\#\bar\gamma$ we have

\[ \int_\Omega \psi(x)\, d\mu_s(x) = \iint_{[0,\infty[^2} R(s,[y_0,r_0],[y_1,r_1])^2\, \psi\big( X(s,[y_0,r_0],[y_1,r_1]) \big)\, d\gamma(r_0,r_1), \]

where, using $|y_0 - y_1| = \pi/2$, we have $R(s,[y_0,r_0],[y_1,r_1])^2 = (1-s)^2 r_0^2 + s^2 r_1^2$ and

\[
X(s,[y_0,r_0],[y_1,r_1]) = (1 - \rho(s))\, y_0 + \rho(s)\, y_1
\quad \text{with } \rho(s) = \frac{2}{\pi} \arccos \Big( \Big[ 1 + \frac{s^2 r_1^2}{(1-s)^2 r_0^2} \Big]^{-1/2} \Big),
\]

see (3.13). The observation is that the integrand can be written in the form $r_1^2\, \Phi(s, r_0/r_1)$.
In particular, for $r_1 > 0$ the two geodesics

\[
\begin{aligned}
\lambda(s) &= \delta_{R(s;[y_0,r_0],[y_1,r_1])} \otimes \delta_{X(s;[y_0,r_0],[y_1,r_1])}, \\
\tilde\lambda(s) &= r_1^2\, \delta_{R(s;[y_0,r_0/r_1],[y_1,1])} \otimes \delta_{X(s;[y_0,r_0/r_1],[y_1,1])}
\end{aligned}
\]

on $\mathfrak C_\Omega$ give rise, via the projection $\mathfrak P$, to the same geodesic on $\Omega$ given by

\[ \mu(s) = R(s;[y_0,r_0],[y_1,r_1])^2\, \delta_{X(s;[y_0,r_0],[y_1,r_1])}. \]

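Thus, for a coupling $\bar\gamma \in \Gamma(\delta_{y_0} \otimes \lambda_0, \delta_{y_1} \otimes \lambda_1)$ given by $\gamma \in \Gamma(\lambda_0,\lambda_1)$ we can define the
normalization $N_0\bar\gamma \in \mathcal M(\mathfrak C_\Omega \times \mathfrak C_\Omega)$ with respect to $s = 0$ as follows ($N_1\bar\gamma$ for $s = 1$ can be
defined similarly):

\[
\begin{aligned}
\iint_{\mathfrak C_\Omega \times \mathfrak C_\Omega} \Phi(z_0,z_1)\, dN_0\bar\gamma
={}& \iint_{]0,\infty[ \times ]0,\infty[} r_1^2\, \Phi([y_0,r_0/r_1],[y_1,1])\, d\gamma(r_0,r_1) \\
&+ \Phi(\mathfrak o,\mathfrak o)\, b_{0,0} + \Phi([y_0,1],\mathfrak o)\, b_0 + \Phi(\mathfrak o,[y_1,1])\, b_1,
\end{aligned}
\]

\[ \text{where } b_{0,0} := \gamma(\{(0,0)\}), \quad b_0 := \int_{]0,\infty[} r_0^2\, d\gamma(r_0,0), \quad b_1 := \int_{]0,\infty[} r_1^2\, d\gamma(0,r_1). \]

The terms involving $b_j$ contain the trivial Hellinger terms, where mass is moved into or
generated out of the tip $\mathfrak o$. Note that this mass can be concentrated to the fixed value
$r_0 = 1$ or $r_1 = 1$, respectively. The second term gives the mass that simply stays in $\mathfrak o$.
The interesting part is the first one, where still a measure $n_0\gamma$ survives:

\[ \iint_{]0,\infty[ \times ]0,\infty[} r_1^2\, \Phi([y_0,r_0/r_1],[y_1,1])\, d\gamma(r_0,r_1) =: \int_{]0,\infty[} \Phi([y_0,r],[y_1,1])\, d(n_0\gamma)(r). \]

We will see below that $(n_0\gamma)(dr)$ gives the mass that leaves $y_0$ with speed $1/r$.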

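It is easy to see that $\bar\gamma$ and $N_0\bar\gamma$ generate the same geodesic curve $\mu$. In terms of
$n_0\gamma$, we can now write the geodesic curve associated with $\gamma$ in a simpler form, namely

\[
\begin{aligned}
\int_\Omega \psi(x)\, d\mu_s(x) ={}& \int_{]0,\infty[} \big( (1-s)^2 r^2 + s^2 \big)\, \psi\big( (1 - \rho(s,r))\, y_0 + \rho(s,r)\, y_1 \big)\, d(n_0\gamma)(r) \\
&+ 0 + (1-s)^2 b_0\, \psi(y_0) + s^2 b_1\, \psi(y_1),
\end{aligned}
\]

where $\rho(s,r) = \frac{2}{\pi} \arccos\big( \big[ 1 + \frac{s^2}{(1-s)^2 r^2} \big]^{-1/2} \big) \in\, ]0,1[$ for $s \in\, ]0,1[$ and $r > 0$, and

\[ a_0 = b_0 + \int_{]0,\infty[} r^2\, d(n_0\gamma)(r) \quad \text{and} \quad a_1 = b_1 + \int_{]0,\infty[} d(n_0\gamma)(r). \]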

To simplify the further notation we now assume $\Omega = [-2,2]$, $y_0 = 0$, and $y_1 = \pi/2$.
By the definition of $\rho$ we find $x = x(r,s) := (1 - \rho(s,r))\, y_0 + \rho(s,r)\, y_1$ if and only if $r =
(s \cos x)/((1-s) \sin x)$. Now differentiating $\int \psi\, d\mu_s$ with respect to $s$ we find

\[
\begin{aligned}
\frac{d}{ds} \int_{[0,\pi/2]} \psi(x)\, d\mu_s(x) ={}& -2(1-s)\, b_0\, \psi(0) + 2s\, b_1\, \psi(\pi/2) \\
&+ \int_{]0,\infty[} \Big( 2\big( s - (1-s) r^2 \big)\, \psi(x(r,s)) + \big( (1-s)^2 r^2 + s^2 \big)\, \psi'(x(r,s))\, \partial_s x(r,s) \Big)\, d(n_0\gamma).
\end{aligned}
\]

Defining the functions $\xi$ and $\Xi$ explicitly via

\[ \xi(s,x) = \frac{\sin^2 x - s}{2s(1-s)} \quad \text{and} \quad \Xi(s,x) = \partial_x \xi(s,x) \tag{5.6} \]

Figure 5: Geodesic curve $s \mapsto \mu(s)$ connecting a superposition of Dirac masses and a single Dirac
measure in Example 5.4(i) for different values of $s$. Here green denotes the measure $\mu_1$ for
$N = 30$, while the blue and the black parts correspond to the parts of $\mu(s)$ that lie below and
above the threshold $\pi/2$. The orange curves are the transport lines.


and eliminating $r$ in the above integral via $r = (s \cos x)/((1-s) \sin x)$ we obtain the
identity

\[ \frac{d}{ds} \int_{[0,\pi/2]} \psi(x)\, d\mu_s(x) = \int_{[0,\pi/2]} \big( 4\xi(s,x)\, \psi(x) + \Xi(s,x)\, \psi'(x) \big)\, d\mu_s(x). \]

Since $\xi$ satisfies the Hamilton-Jacobi equation $\partial_s\xi + \frac12 |\nabla\xi|^2 + 2\xi^2 = 0$, we conclude that
the pair $(\mu,\xi)$ indeed satisfies the equation (5.1).

It is interesting to note that $\xi$ is independent of the measure $n_0\gamma$. All the information
about the precise form of the connecting geodesic is solely encoded in how
the singularities for $s \searrow 0$ and $s \nearrow 1$ are formed, and this information is exactly contained
in $n_0\gamma$.
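The claim that $\xi$ solves the modified Hamilton-Jacobi equation can be probed with finite differences. The sketch below assumes the explicit form $\xi(s,x) = (\sin^2 x - s)/(2s(1-s))$ for $\Omega = [-2,2]$, $y_0 = 0$, $y_1 = \pi/2$; this is the expression obtained by matching $4\xi\,((1-s)^2 r^2 + s^2) = 2(s - (1-s)r^2)$ under the substitution $r = (s\cos x)/((1-s)\sin x)$, and it should be read as a reconstruction rather than a quotation. Step sizes and sample points are arbitrary choices.

```python
import math

def xi(s, x):
    """Assumed potential xi(s, x) = (sin^2 x - s) / (2 s (1 - s)) for 0 < s < 1."""
    return (math.sin(x) ** 2 - s) / (2 * s * (1 - s))

def hj_residual(s, x, h=1e-5):
    """Central-difference residual of d_s xi + |d_x xi|^2 / 2 + 2 xi^2."""
    d_s = (xi(s + h, x) - xi(s - h, x)) / (2 * h)
    d_x = (xi(s, x + h) - xi(s, x - h)) / (2 * h)
    return d_s + 0.5 * d_x ** 2 + 2 * xi(s, x) ** 2

def position(r, s):
    """x(r, s) in ]0, pi/2[ characterised by r = s cos x / ((1 - s) sin x)."""
    return math.atan2(s, (1 - s) * r)

def mass_rate_gap(r, s):
    """Difference between 2(s - (1-s) r^2) and 4 xi(s, x(r,s)) ((1-s)^2 r^2 + s^2)."""
    growth = 2 * (s - (1 - s) * r ** 2)
    return growth - 4 * xi(s, position(r, s)) * ((1 - s) ** 2 * r ** 2 + s ** 2)

grid_s = (0.2, 0.5, 0.8)
grid_x = (0.1, 0.7, 1.4)
grid_r = (0.2, 1.0, 3.0)
max_hj_residual = max(abs(hj_residual(s, x)) for s in grid_s for x in grid_x)
max_mass_gap = max(abs(mass_rate_gap(r, s)) for s in grid_s for r in grid_r)
```

The second check confirms that the reaction rate $4\xi$ evaluated along the transport lines reproduces the growth factor of the weight $(1-s)^2 r^2 + s^2$, independently of the value of $r$, which is the mechanism behind the independence of $\xi$ from $n_0\gamma$.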


5.3 Dilation of measures 

For the Kantorovich-Wasserstein distance there is a geodesic connection between any
measure $\mu_1$ and the Dirac measure $\mu_0 = \mu_1(\Omega)\delta_{y_0}$ by radially dilating the measure, viz.

\[ \mu^{\mathrm{KaWa}}(s) = (T_s)_\#\mu_1, \quad \text{where } T_s(x) = (1-s)\, y_0 + s x. \]

This dilation corresponds to the solution $\xi(s,x) = \frac12 |x - y_0|^2 / s$ of the standard Hamilton-
Jacobi equation $\partial_s\xi + \frac12 |\nabla\xi|^2 = 0$, and $\partial_s\mu + \operatorname{div}(\mu\nabla\xi) = 0$.

A possible generalization of this dilation to the Hellinger-Kantorovich distance is given
by a solution $\xi$ of the modified Hamilton-Jacobi equation $\partial_s\xi + \frac12 |\nabla\xi|^2 + 2\xi^2 = 0$ having
the form

\[ \xi(s,x) = \frac{\zeta(x)}{2s} \quad \text{with } |\nabla\zeta(x)|^2 + 4\zeta(x)^2 - 4\zeta(x) = 0. \]

Figure 6: Geodesic curve $s \mapsto \mu(s)$ connecting a line measure and a single Dirac measure in Example
5.4(ii) for different values of $s$. Here green denotes the measure $\mu_1$, while the blue and the black
parts correspond to the parts of $\mu(s)$ that lie below and above the threshold $\pi/2$. The orange
curves are the transport lines.


The trivial solutions $\zeta \equiv 0$ and $\zeta \equiv 1$ correspond to constant and pure Hellinger geodesics,
respectively. However, there are many other solutions, e.g.

\[
\zeta(x) = \begin{cases} (\sin |x|)^2 & \text{for } |x| \le \pi/2, \\ 1 & \text{for } |x| \ge \pi/2, \end{cases}
\qquad \text{and} \qquad
\tilde\zeta(x) = \min\{ \zeta(x - y_j) \mid j = 1,\dots,J \}
\]

if $|y_j - y_k| > \pi$ for all $j \ne k$.

Staying with the first solution $\zeta$, we see that an arbitrary measure $\mu_1$ can be connected to $\mu_0 = a_0\delta_0$,
with $a_0 \ge 0$ fixed, by the geodesic connection $\mu(s) = \mu_s$ given via

\[
\begin{aligned}
\int_\Omega \psi(x)\, d\mu_s(x) ={}& \int_{\Omega \cap \{|x| \ge \pi/2\}} s^2\, \psi(x)\, d\mu_1(x) \\
&+ \int_{\Omega \cap \{|x| < \pi/2\}} \big( (1 - s^2)(\cos |x|)^2 + s^2 \big)\, \psi\Big( \arctan[s \tan |x|]\, \frac{x}{|x|} \Big)\, d\mu_1(x),
\end{aligned}
\]

where the first term on the right-hand side denotes the pure Hellinger part, while the
second term involves the concentration into $a_0\delta_0$ for $s \searrow 0$, where the total mass at $s = 0$
equals $a_0 := \int_{\Omega \cap \{|x| < \pi/2\}} (\cos |x|)^2\, d\mu_1(x)$.

Again it is easy to show that the pair $(\mu,\xi)$ with $\xi(s,x) = \zeta(x)/(2s)$ satisfies the
formal equation (5.1) for geodesic curves. We also note that the dilation operation is
unique, even if $\mu_1$ has positive mass on the sphere $\{|x| = \pi/2\}$. This is because of the
fixed function $\xi$. Of course there might be other geodesic curves connecting $\mu_0 = a_0\delta_0$
and $\mu_1$, e.g. for $\mu_1 = a_1\delta_{y_1}$ with $|y_1| = \pi/2$, where we have all the solutions constructed
in Section 5.2.
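The stationary equation for $\zeta$ and the dilation formula can be illustrated in one space dimension. The following is a minimal sketch, assuming $\zeta(x) = \sin^2(\min\{|x|, \pi/2\})$ and treating a single atom `weight * delta_x` of $\mu_1$; the function `dilated_atom` and its return convention (position, mass) are hypothetical helpers introduced only for this illustration.

```python
import math

def zeta(x):
    """Profile zeta(x) = sin^2|x| for |x| <= pi/2 and 1 beyond (one-dimensional case)."""
    return math.sin(min(abs(x), math.pi / 2)) ** 2

def zeta_residual(x, h=1e-6):
    """Central-difference residual of |zeta'|^2 + 4 zeta^2 - 4 zeta."""
    grad = (zeta(x + h) - zeta(x - h)) / (2 * h)
    return grad ** 2 + 4 * zeta(x) ** 2 - 4 * zeta(x)

def dilated_atom(s, x, weight=1.0):
    """Position and mass at time s of the atom weight*delta_x of mu_1 under the dilation."""
    dist = abs(x)
    if dist >= math.pi / 2:
        # Pure Hellinger part: the atom stays in place and its mass grows like s^2.
        return x, s * s * weight
    mass = ((1 - s * s) * math.cos(dist) ** 2 + s * s) * weight
    pos = math.copysign(math.atan(s * math.tan(dist)), x) if dist > 0 else 0.0
    return pos, mass

# Sample points away from the kink at |x| = pi/2, where zeta is not differentiable.
max_zeta_residual = max(abs(zeta_residual(x)) for x in (0.3, 1.0, -1.2, 2.0))

start_pos, start_mass = dilated_atom(0.0, 1.0)   # atom collapsed into the origin
end_pos, end_mass = dilated_atom(1.0, 1.0)       # atom back at its place in mu_1
```

At $s = 0$ the atom sits at the origin with mass $\cos^2|x|$, which is its contribution to $a_0$, and at $s = 1$ it is recovered with its full weight.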


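Example 5.4 (i) As a more concrete example, we consider in $\Omega = [0,1] \times [0,2]$ the
measures $\mu_1 = \sum_{k=1}^N \delta_{x_k}$, $N \ge 2$, and $\mu_0 = a_0\delta_0$, where

\[
x_k = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{k-1}{N-1} \begin{pmatrix} 0 \\ 2 \end{pmatrix} \ \text{ for } k = 1,\dots,N,
\quad \text{and} \quad a_0 = \sum_{k : |x_k| < \pi/2} \cos(|x_k|)^2.
\]

Figure 7: Supports of the geodesic curve $\mu(s)$ in Example 5.4(ii) for different values of $s$.

Using the formula above, we obtain the geodesic connection

\[
\mu(s) = s^2 \sum_{k : |x_k| \ge \pi/2} \delta_{x_k} + \sum_{k : |x_k| < \pi/2} \big( (1 - s^2) \cos(|x_k|)^2 + s^2 \big)\, \delta_{\rho_k(s) x_k},
\quad \text{where } \rho_k(s) = \frac{1}{|x_k|} \arctan[s \tan |x_k|].
\]

The geodesic connection $\mu(s)$ is depicted in Figure 5 for different values of $s$.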

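(ii) Similarly, we can compute a geodesic connection $\mu(s)$ for the line measure $\mu_1 =
\delta_1 \otimes \mathcal L^1|_{[0,2]}$, which is collapsed into the measure $\mu_0 = a_0\delta_0$ with $a_0 = \int_{\{y \in [0,2] : |(1,y)| < \pi/2\}} (\cos |(1,y)|)^2\, dy$. In
this case, $\mu(s)$ is concentrated on the set given by the function

\[
X(s;x) = \begin{cases} \rho(s;x)\, x & \text{for } |x| < \pi/2, \\ x & \text{otherwise,} \end{cases}
\qquad \text{for } s \in [0,1] \text{ and } x \in \operatorname{supp} \mu_1,
\]

where $\rho(s;x) = \arctan(s \tan |x|)/|x|$. On these curves the density with respect to the one-dimensional
Hausdorff measure is, for $y \in \Omega$ and $X_s(x_2) := X(s;(1,x_2))$ for $x_2 \in [0,2]$,
given by

\[ a(s,y) = \frac{\alpha(s;\cdot)}{|X_s'(\cdot)|} \circ X_s^{-1}(y), \]

where the profile $\alpha$ reads

\[
\alpha(s;x) = \begin{cases} (1 - s^2)(\cos |x|)^2 + s^2 & \text{for } |x| < \pi/2, \\ s^2 & \text{otherwise.} \end{cases}
\]

The curve $\mu(s)$ is shown in Figure 6.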


5.4 Transport of characteristic functions 

Here, we discuss a method to explicitly construct the geodesic connection between two
characteristic functions $\mu_j = a_j\, \chi_{[x_j^{\mathrm{left}}, x_j^{\mathrm{right}}]}\, dx$, where $[x_j^{\mathrm{left}}, x_j^{\mathrm{right}}] \subset \mathbb R^1$. However, to simplify notation
we will restrict to the specific case

\[ \mu_0 = \chi_{[-\pi/4,\pi/4]}\, dx \quad \text{and} \quad \mu_1 = \chi_{[\pi/2,\pi]}\, dx. \]

Obviously, we have the Hellinger parts $\mu_0^{\mathsf H} = \chi_{[-\pi/4,0]}\, dx$ and $\mu_1^{\mathsf H} = \chi_{[3\pi/4,\pi]}\, dx$, which
are absorbed and generated, respectively, without any interaction with the transport in
between.

To construct a transport geodesic from $\mu_0^{\mathrm t} = \chi_{I_0}\, dx$ to $\mu_1^{\mathrm t} = \chi_{I_1}\, dx$, with $I_0 =\, ]0,\pi/4[$
and $I_1 =\, ]\pi/2, 3\pi/4[$, we find the functions $\varrho_j : I_j \to \mathbb R$ via minimizing the entropy-
transport functional associated with $\mu_0^{\mathrm t}$ and $\mu_1^{\mathrm t}$. We establish the calibration measure $\eta$ via a map
$h : I_0 \to I_1$ in the form

\[ \iint_{I_0 \times I_1} \Psi(x_0,x_1)\, d\eta(x_0,x_1) = \int_{I_0} \Psi(x,h(x))\, f(x)\, dx. \]

Checking the marginal conditions $\Pi^j_\#\eta = \varrho_j\, dx$ we find $f(x) = \varrho_0(x) = \varrho_1(h(x))\, h'(x)$.
Moreover, the optimality conditions in Theorem 3.7 give, for all $x \in I_0$ and $y \in I_1$,

\[ \varrho_0(x)\, \varrho_1(h(x)) = [\cos(h(x) - x)]^2 \quad \text{and} \quad \varrho_0(x)\, \varrho_1(y) \ge [\cos(y - x)]^2. \tag{5.7} \]

Deriving the first-order optimality conditions at $y = h(x)$ from the second relation in
(5.7) we find

\[ 2 \sin(h(x) - x) \cos(h(x) - x)\, h'(x) = \varrho_0'(x)\, \varrho_1(h(x))\, h'(x) = \varrho_0(x)\, \varrho_0'(x) = \tfrac12 \big( \varrho_0(x)^2 \big)'. \]

Since the first relation in (5.7) has the form $\varrho_0(x)^2 = h'(x)\, [\cos(h(x) - x)]^2$ we find

\[ h''(x) = 2\big( h'(x) + h'(x)^2 \big) \tan(h(x) - x), \qquad h(0) = \pi/2, \quad h(\pi/4) = 3\pi/4, \]

which has a unique monotone solution. Indeed, to see this let $h(x) = \pi/2 + x - w(x)$,
where now $w(0) = w(\pi/4) = 0$ and $w \ge 0$. Then the ODE reads $w'' = b(w')\, c(w)$
for suitable functions $b$ and $c$. Rewriting it in the form $w' w''/b(w') = c(w)\, w'$ we find
$B(w'(x)) = C(w(x)) + \text{const}$, where $B'(y) = y/b(y)$ and $C'(y) = c(y)$. An explicit calculation
and exponentiating both sides yields

\[ \frac{\sqrt{1 - w'(x)}}{2 - w'(x)} = c_* \sin w(x), \qquad w(0) = w(\pi/4) = 0. \]


Solving for $w'$ we find $w' = g_\pm(c_* \sin w)$ with $g_\pm(a) = \big( 4a^2 - 1 \pm \sqrt{1 - 4a^2} \big)/(2a^2)$ for
$0 < a \le 1/2$, where $\pm g_\pm(a) > 0$ for $a \ne 1/2$ and $g_\pm(1/2) = 0$. Thus, $w$ will have a unique
maximum $w_* = w(x_*)$ with $c_* \sin w_* = 1/2$, and $w_*$ can be determined uniquely from

\[
\frac{\pi}{4} = \int_0^{x_*} dx + \int_{x_*}^{\pi/4} dx
= \int_0^{w_*} \frac{dw}{g_+(c_* \sin w)} + \int_0^{w_*} \frac{dw}{-g_-(c_* \sin w)}
= \int_0^{w_*} \frac{\sin w_*}{\sqrt{(\sin w_*)^2 - (\sin w)^2}}\, dw = K(w_*) \sin w_*,
\]

where $K$ is the elliptic K function. Numerically, we find $w_* = 0.4895$ and thus $c_* = 1.0634$.

Now it is straightforward to show that there is exactly one $c_* > 0$ such that a solution
with $w(x) > 0$ and $w'(x) < 1$ exists.

Figure 8: The densities $f(s,\cdot)$ of the geodesic curves $\mu(s) = f(s,\cdot)\, dx$ connecting $\mu_0 =
\chi_{[-\pi/4,\pi/4]}\, dx$ and $\mu_1 = \chi_{[\pi/2,\pi]}\, dx$ for $s = 0.01, 0.1, 0.2, \dots, 0.7$.


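Based on the function $h$ the densities $\varrho_0$ and $\varrho_1$ are explicitly known and we may write
the geodesic connection $\mu_s = \mu(s)$ for $\mu_0^{\mathrm t} = \chi_{I_0}\, dx$ and $\mu_1^{\mathrm t}$ as in (3.21):

\[
\begin{aligned}
&\int_{[0,3\pi/4]} \psi(y)\, d\mu_s(y) = \int_{I_0} R(s,x)^2\, \psi(Y(s,x))\, \varrho_0(x)\, dx \\
&\text{with } R(s,x)^2 = (1-s)^2 r_0^2(x) + s^2 r_1^2(h(x)) + 2s(1-s) \\
&\text{and } Y(s,x) = x + \arccos\Big( \frac{(1-s)\, r_0(x) + s\, r_1(h(x)) \cos(h(x) - x)}{R(s,x)} \Big),
\end{aligned}
\]

where $r_j(x_j) = \varrho_j(x_j)^{-1/2}$. Thus, the density $f(s,\cdot)$ of $\mu_s$ satisfies the relation

\[
\begin{aligned}
\int_0^{Y(s,x)} f(s,y)\, dy &= \int_0^{3\pi/4} \chi_{\{y < Y(s,x)\}}(y)\, d\mu_s(y) \\
&= \int_{I_0} R(s,x_0)^2\, \chi_{\{y < Y(s,x)\}}(Y(s,x_0))\, \varrho_0(x_0)\, dx_0 = \int_0^x R(s,x_0)^2\, \varrho_0(x_0)\, dx_0.
\end{aligned}
\]

Differentiation with respect to $x$ gives the explicit formula

\[ f(s,Y(s,x)) = \frac{R(s,x)^2\, \varrho_0(x)}{\partial_x Y(s,x)}. \]

In Figure 8 we plot the densities together with the corresponding Hellinger parts.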
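The constants $w_*$ and $c_*$ of this section can be reproduced with a few lines of code. The sketch below assumes that the relation $\pi/4 = K(w_*)\sin w_*$ is to be read with the complete elliptic integral of the first kind evaluated at modulus $k = \sin w_*$, which is what the substitution $\sin w = \sin w_* \sin\theta$ in the preceding integral yields. The arithmetic-geometric-mean evaluation of $K$ and the bisection bounds are implementation choices of this illustration.

```python
import math

def ellip_k(k):
    """Complete elliptic integral of the first kind K(k), modulus k in [0, 1), via the AGM."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    while abs(a - b) > 1e-15:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def travel_length(w):
    """sin(w) * K(sin w): the x-length needed for w to rise to height w and fall back to 0."""
    return math.sin(w) * ellip_k(math.sin(w))

def solve_w_star(target=math.pi / 4, lo=1e-9, hi=1.5, tol=1e-12):
    """Bisection for travel_length(w) = target; travel_length is increasing on ]0, pi/2[."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if travel_length(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

w_star = solve_w_star()
c_star = 1.0 / (2.0 * math.sin(w_star))   # from c_* sin(w_*) = 1/2
```

Under the stated reading of $K$, the computed pair agrees with the rounded values $w_* = 0.4895$ and $c_* = 1.0634$ quoted above. Libraries that parametrize $K$ by $m = k^2$ instead of $k$ need the argument squared accordingly.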

5.5 Towards a characterization of all geodesics connecting two
measures

Here we discuss the question of describing all geodesic curves for two given measures $\mu_0$
and $\mu_1$. As we have seen in Section 5.2, the set of all these curves can be very big, in fact
even infinite dimensional. The final aim would be to define a geometric tangent cone in
the sense of [AGS05, Ch. 12].

The major tool in understanding the structure of all geodesic connections is Corollary
4.4, which states that all geodesic curves $s \mapsto \mu(s)$ are given as projections $\mu(s) = \mathfrak P\lambda(s)$
of geodesic curves $\lambda$ in $\mathcal M_2(\mathfrak C_\Omega)$. Thus, writing (3.16) more explicitly via

\[
\mathsf{HK}(\mu_0,\mu_1)^2 = \min\Big\{ \iint_{\mathfrak C_\Omega \times \mathfrak C_\Omega} \mathsf d_{\mathfrak C}(z_0,z_1)^2\, d\gamma \ \Big|\ \mathfrak P\, \Pi^0_\#\gamma = \mu_0,\ \mathfrak P\, \Pi^1_\#\gamma = \mu_1 \Big\},
\tag{5.8}
\]

we can define the set of optimal plans via

\[ \mathrm{Opt}_{\mathsf{HK}}(\mu_0,\mu_1) := \big\{ \gamma \in \mathcal M_2(\mathfrak C_\Omega \times \mathfrak C_\Omega) \ \big|\ \gamma \text{ is optimal in (5.8)} \big\}, \]

which is a convex set. Every optimal plan $\gamma$ gives rise to a different geodesic $\lambda(s) =
Z(s;\cdot,\cdot)_\#\gamma$ in $\mathcal M_2(\mathfrak C_\Omega)$. While for different $\gamma$ the geodesics $\lambda$ are different, the same is no
longer true for the projections $\mu(\cdot) = \mathfrak P\lambda(\cdot)$.

The major redundancy in the set $\mathrm{Opt}_{\mathsf{HK}}(\mu_0,\mu_1)$ of optimal plans is seen through the
scaling invariance given in relation (3.18). If $\gamma$ contains a transport of mass $m$ along
a cone geodesic connecting $[x_0,r_0]$ and $[x_1,r_1]$, then the contribution to the projection
$\mu(s) = \mathfrak P\lambda(s)$ is equal to a transport of mass $\vartheta^2 m$ along the cone geodesic connecting
$[x_0,r_0/\vartheta]$ and $[x_1,r_1/\vartheta]$ for all $\vartheta > 0$. Thus, we can define a normalization operator $N$
acting on plans $\gamma$ that does not change the projection.

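For this we consider the partition of $\mathfrak C_\Omega \times \mathfrak C_\Omega$ given by the sets $\mathfrak G$, $\mathfrak G_{12}$, $\mathfrak S_1$, $\mathfrak S_2$, and $\mathfrak G_0$:

\[
\begin{aligned}
\mathfrak G &:= \big\{ ([x_1,r_1],[x_2,r_2]) \in \mathfrak C_\Omega \times \mathfrak C_\Omega : r_1 r_2 > 0,\ |x_1 - x_2| < \pi/2 \big\}, \\
\mathfrak G_{12} &:= \big\{ ([x_1,r_1],[x_2,r_2]) \in \mathfrak C_\Omega \times \mathfrak C_\Omega : r_1 r_2 > 0,\ |x_1 - x_2| \ge \pi/2 \big\}, \\
\mathfrak S_1 &:= \big\{ ([x_1,r_1],\mathfrak o) : r_1 > 0 \big\}, \quad
\mathfrak S_2 := \big\{ (\mathfrak o,[x_2,r_2]) : r_2 > 0 \big\}, \quad
\mathfrak G_0 := \{\mathfrak o\} \times \{\mathfrak o\}.
\end{aligned}
\]

With these sets we define the scaling function

\[
\vartheta(z_1,z_2) := \begin{cases}
\big( r_1 r_2 \cos(|x_1 - x_2|) \big)^{1/2} & \text{if } (z_1,z_2) \in \mathfrak G, \\
r_1 & \text{if } (z_1,z_2) \in \mathfrak G_{12} \cup \mathfrak S_1, \\
r_2 & \text{if } (z_1,z_2) \in \mathfrak S_2, \\
1 & \text{if } z_1 = z_2 = \mathfrak o
\end{cases}
\]

and employ the dilation map from (3.17) to generate the corresponding rescaling
$N : \mathcal M_2(\mathfrak C_\Omega \times \mathfrak C_\Omega) \to \mathcal M_2(\mathfrak C_\Omega \times \mathfrak C_\Omega)$ by $N\gamma := (h_\vartheta)_\#(\vartheta^2\gamma)\, \llcorner\, (\mathfrak C_\Omega \times \mathfrak C_\Omega \setminus \mathfrak G_0)$, where $\llcorner$ denotes
restriction of a measure. By the definition of $\vartheta$ and the scaling property of $\mathsf d_{\mathfrak C}$ we first
find $\mathfrak P\, \Pi^i_\#\gamma = \mathfrak P\, \Pi^i_\#(N\gamma)$ and $\iint_{\mathfrak C_\Omega \times \mathfrak C_\Omega} \mathsf d_{\mathfrak C}(z_0,z_1)^2\, d\gamma = \iint_{\mathfrak C_\Omega \times \mathfrak C_\Omega} \mathsf d_{\mathfrak C}(z_0,z_1)^2\, d(N\gamma)$. Hence, for
each $\gamma \in \mathrm{Opt}_{\mathsf{HK}}(\mu_0,\mu_1)$ we again have $N\gamma \in \mathrm{Opt}_{\mathsf{HK}}(\mu_0,\mu_1)$. Thus, we define the normalized
optimal plans via

\[ \mathrm{NormOpt}_{\mathsf{HK}}(\mu_0,\mu_1) := \big\{ \gamma \in \mathrm{Opt}_{\mathsf{HK}}(\mu_0,\mu_1) \ \big|\ \gamma = N\gamma \big\}, \tag{5.9} \]

which is a much smaller but still closed and convex set.

Using the scaling properties of the geodesics on $\mathfrak C_\Omega$, which are given via the interpolating
functions $Z = (X,R)$ as $\mu(s) = \mathfrak P Z(s;\cdot)_\#\gamma$ (cf. (3.21)), and the fact that $\vartheta$ depends
on $x_1, x_2$ only through their distance $|x_1 - x_2|$, we also see that $\gamma$ and $N\gamma$ generate the
same geodesic, viz.

\[ M_\gamma(s) := \mathfrak P\big( Z(s;\cdot,\cdot)_\#\gamma \big) = \mathfrak P\big( Z(s;\cdot,\cdot)_\#(N\gamma) \big) = M_{N\gamma}(s). \]

With these preparations we are able to prove that there exists a unique geodesic connecting
$\mu_0$ to $\mu_1$ if $\mu_0$ is absolutely continuous with respect to the Lebesgue measure. More
precisely, we show that in this case $N$ eliminates all redundancies: $\mathrm{NormOpt}_{\mathsf{HK}}(\mu_0,\mu_1)$ contains
only one element, and $N\gamma$ characterizes the geodesic connection of $\mu_0$ and $\mu_1$ uniquely.


Theorem 5.5 For every couple $\mu_0, \mu_1 \in \mathcal M(\Omega)$ with $\mu_0$ absolutely continuous w.r.t. the
Lebesgue measure, there exists a unique geodesic $\mu$ connecting $\mu_0$ to $\mu_1$ and a unique
$\gamma \in \mathrm{NormOpt}_{\mathsf{HK}}(\mu_0,\mu_1)$. In particular, for each geodesic curve $\mu$ connecting $\mu_0$ to $\mu_1$
there exists a unique $\gamma \in \mathrm{NormOpt}_{\mathsf{HK}}(\mu_0,\mu_1)$ such that $\mu = M_\gamma$.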
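Proof: By the above discussion we have just to check the uniqueness of $\gamma$. We first
show that any $\gamma \in \mathrm{NormOpt}_{\mathsf{HK}}(\mu_0,\mu_1)$ does not charge $\mathfrak G_{12}$.

Since $\gamma$ is optimal, the rescaling given by $N$ and [LMS15, Thm. 7.21] yield that
$\gamma(\mathfrak G_{12}) = \gamma(\mathfrak G_{12}')$, where

\[ \mathfrak G_{12}' = \big\{ ([x_1,1],[x_2,r_2]) \in \mathfrak C_\Omega \times \mathfrak C_\Omega : r_2 > 0,\ |x_1 - x_2| = \pi/2 \big\} \subset \mathfrak G_{12}. \]

Setting $\tilde\mu_0 := \mathfrak P\, \Pi^1_\#(\gamma \llcorner \mathfrak G_{12}) = \mathfrak P\, \Pi^1_\#(\gamma \llcorner \mathfrak G_{12}')$, we have $\tilde\mu_0 \le \mu_0$; in particular
$\tilde\mu_0$ is absolutely continuous with respect to the Lebesgue measure. If we set $f(x) :=
\min_{y \in \operatorname{supp}(\mu_1)} |x - y|$, applying [LMS15, Thm. 6.3(b)] we deduce that

\[ f(x) = \pi/2 \quad \text{for } \tilde\mu_0\text{-a.e. } x \in \Omega. \]

Applying the co-area formula to $f$, see [Fed69, Lem. 3.2.34], we have

\[
\mathcal L^d\big( \{ x \in \Omega \mid \pi/2 - \varepsilon < f(x) < \pi/2 + \varepsilon \} \big)
= \int_{\pi/2 - \varepsilon}^{\pi/2 + \varepsilon} \mathcal H^{d-1}\big( \{ x \in \Omega \mid f(x) = \rho \} \big)\, d\rho
\quad \text{for every } \varepsilon > 0,
\]

so that passing to the limit as $\varepsilon \downarrow 0$ we get $\mathcal L^d(\{ x \in \Omega \mid f(x) = \pi/2 \}) = 0$. It follows
that $\tilde\mu_0$ is the null measure and $\gamma(\mathfrak G_{12}) = 0$.

Let us now suppose that $\gamma', \gamma'' \in \mathrm{NormOpt}_{\mathsf{HK}}(\mu_0,\mu_1)$.

Combining Theorems 7.2.1 (iv) and 6.7 of [LMS15] (where we use the absolute continuity
of $\mu_0$ again) we see that the restrictions of $\gamma'$ and $\gamma''$ to $\mathfrak G$ coincide. By subtracting this
common part from both of them and the corresponding homogeneous marginals from $\mu_0$
and $\mu_1$, it is not restrictive to assume that $\gamma'$ and $\gamma''$ are concentrated in the complement
of $\mathfrak G$. By the previous claim, we obtain that $\gamma'$ and $\gamma''$ are concentrated on $\mathfrak S_1 \cup \mathfrak S_2$.

It is also easy to see that the restrictions of $\gamma'$ and $\gamma''$ to $\mathfrak S_i$, $i = 1,2$, coincide as
well: considering e.g. $\mathfrak S_1$, by construction we have that $\Pi^1_\#\gamma' = \Pi^1_\#\gamma'' = \mu_1 \otimes \delta_1$, whereas
$\Pi^2_\#\gamma' = \Pi^2_\#\gamma'' = \mu_1(\mathbb R^d)\, \delta_{\mathfrak o}$. It follows that $\gamma' \llcorner \mathfrak S_1 = \gamma'' \llcorner \mathfrak S_1 = (\mu_1 \otimes \delta_1) \otimes \delta_{\mathfrak o}$. A similar
argument holds for $\mathfrak S_2$. This proves the result. ■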


The major point in the above proof is to show that $\gamma(\Theta_{12})=0$, i.e. there is no 
transport over the distance $\pi/2$. From Section 5.2 we know that in the opposite case 
$\mathrm{NormOpt}_{\mathsf{HK}}(\mu_0,\mu_1)$ can be infinite dimensional. 

We expect that, by refining the arguments above and using the dual characterization 
of $\mathsf{HK}$, it is possible to prove that for each geodesic curve $\mu$ connecting $\mu_0$ and $\mu_1$ (both not 
necessarily absolutely continuous w.r.t. the Lebesgue measure) there exists a unique $\gamma\in\mathrm{NormOpt}_{\mathsf{HK}}(\mu_0,\mu_1)$ 
such that $\mu=\mu^{\gamma}$. As in the Kantorovich-Wasserstein case, the optimal plan $\gamma$ should 
be uniquely determined by fixing an intermediate point $\mu(s)$ with $s\in\,]0,1[$ along a 
geodesic. In particular, geodesics for the Hellinger-Kantorovich distance $\mathsf{HK}$ in $\mathcal{M}(\Omega)$ will 
then be nonbranching. Indeed, for the rich set of geodesics connecting two Dirac masses 
at distance $\pi/2$ discussed in Section 5.2 this can be shown by direct inspection. We will 
address these questions in a forthcoming paper. 


5.6 HK is not semiconcave 

The metric on a geodesic space $(Y,d)$ is called $K$-semiconcave if for all points $y_0,y_1,y_*\in Y$ 
and all minimal geodesic curves $y:[0,1]\to Y$ with $y(i)=y_i$ for $i\in\{0,1\}$, we have 

$$d(y(s),y_*)^2 \ge (1-s)\,d(y_0,y_*)^2 + s\,d(y_1,y_*)^2 - K s(1-s)\,d(y_0,y_1)^2. \tag{5.10}$$ 

It is well-known that the Wasserstein distance on a domain $\Omega\subset\mathbb{R}^d$ is 1-semiconcave 
(such that $(\mathcal{P}_2(\Omega),\mathsf{W})$ is a positively curved (PC) space, see [AGS05, Ch. 12.3]), and it is 
easy to check that the Hellinger-Kakutani distance $\mathsf{H}$ is 1-semiconcave. Indeed, with the 
notation of Section 2 we have the identity 

$$\mathsf{H}(\mu^{\mathsf{H}}(s),\mu_*)^2 = (1-s)\,\mathsf{H}(\mu_0,\mu_*)^2 + s\,\mathsf{H}(\mu_1,\mu_*)^2 - s(1-s)\,\mathsf{H}(\mu_0,\mu_1)^2$$ 

for any $\mu_0,\mu_1,\mu_*\in\mathcal{M}(\Omega)$, where $s\mapsto\mu^{\mathsf{H}}(s)\in\mathcal{M}(\Omega)$ is the Hellinger geodesic from (2.8). 
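
The identity is the usual parallelogram-type relation of a Hilbert space, applied to square roots of densities. The following Python sketch checks it numerically for discrete measures. It assumes that the squared Hellinger-Kakutani distance is the sum of squared differences of the square roots of the weights (no prefactor) and that the geodesic from (2.8) interpolates the square roots linearly; this normalization and all function names are assumptions made for illustration. A different constant prefactor would scale both sides equally and would not affect the check.

```python
import math
import random

def hellinger_sq(mu, nu):
    """Squared Hellinger distance of two discrete measures (assumed normalization, no prefactor)."""
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(mu, nu))

def hellinger_geodesic(mu0, mu1, s):
    """Assumed geodesic: linear interpolation of the square roots of the weights."""
    return [((1.0 - s) * math.sqrt(a) + s * math.sqrt(b)) ** 2 for a, b in zip(mu0, mu1)]

def identity_gap(mu0, mu1, mu_star, s):
    """Left-hand side minus right-hand side of the 1-semiconcavity identity."""
    lhs = hellinger_sq(hellinger_geodesic(mu0, mu1, s), mu_star)
    rhs = ((1.0 - s) * hellinger_sq(mu0, mu_star)
           + s * hellinger_sq(mu1, mu_star)
           - s * (1.0 - s) * hellinger_sq(mu0, mu1))
    return lhs - rhs

random.seed(0)
mu0, mu1, mu_star = ([random.random() for _ in range(6)] for _ in range(3))
max_gap = max(abs(identity_gap(mu0, mu1, mu_star, k / 10)) for k in range(11))
print(f"largest deviation from the identity: {max_gap:.2e}")
```

Up to rounding errors the deviation vanishes, which reflects that equality (and not merely an inequality) holds in (5.10) with $K=1$ for this model of the Hellinger geodesic.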

In contrast, the Hellinger-Kantorovich distance is not $K$-semiconcave for any $K$ if 
$\Omega$ is not one-dimensional. For the one-dimensional case $\Omega\subset\mathbb{R}$ it is shown in [LMS15, 
Thm. 8.9] that $(\mathcal{M}([a,b]),\mathsf{HK})$ is a PC space, which means that 1-semiconcavity holds. 
The following result shows that $(\mathcal{M}(\Omega),\mathsf{HK})$ is not a PC space if $\Omega$ has dimension $d\ge 2$. 
For this case we consider a simple example, namely 

$$\mu_0=\delta_{x_0},\quad \mu_1=\delta_{x_1},\quad \mu_*=b\,\delta_z,\quad\text{with } x_0=0,\ x_1=\tfrac{\pi}{2}e_1,\ z=\tfrac{\pi}{4}e_1+y\,e_2,$$ 

where $e_1$ and $e_2$ are the first two unit vectors and $y>0$. As a geodesic curve we choose 

$$\mu(s)=a(s)\,\delta_{\rho(s)e_1} \quad\text{with}\quad a(s)=(1-s)^2+s^2 \quad\text{and}\quad \rho(s)=\arctan\big(s/(1-s)\big).$$ 
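
That this curve has constant speed between the two Dirac masses can be checked numerically. The sketch below assumes the explicit formula for two Dirac masses, $\mathsf{HK}(a_0\delta_{x_0},a_1\delta_{x_1})^2=a_0+a_1-2\sqrt{a_0a_1}\cos(\min\{|x_0-x_1|,\pi/2\})$, in the normalization in which two unit masses at distance $\pi/2$ have squared distance 2; the function names are hypothetical and chosen only for this illustration.

```python
import math

def cos_trunc(a):
    """Truncated cosine: cos(min(|a|, pi/2))."""
    return math.cos(min(abs(a), math.pi / 2))

def hk_sq_dirac(a0, x0, a1, x1):
    """Assumed formula for the squared HK distance of a0*delta_x0 and a1*delta_x1."""
    return a0 + a1 - 2.0 * math.sqrt(a0 * a1) * cos_trunc(math.dist(x0, x1))

def geodesic(s):
    """Mass a(s) and position rho(s)*e_1 of the curve mu(s)."""
    a = (1.0 - s) ** 2 + s ** 2
    rho = math.atan2(s, 1.0 - s)  # arctan(s/(1-s)) on [0,1), and pi/2 at s = 1
    return a, (rho, 0.0)

def speed_defect(s, t):
    """HK(mu(s), mu(t))^2 minus (t-s)^2 * HK(mu_0, mu_1)^2; zero for a constant-speed geodesic."""
    a_s, x_s = geodesic(s)
    a_t, x_t = geodesic(t)
    total_sq = hk_sq_dirac(1.0, (0.0, 0.0), 1.0, (math.pi / 2, 0.0))
    return hk_sq_dirac(a_s, x_s, a_t, x_t) - (t - s) ** 2 * total_sq

grid = [k / 10 for k in range(11)]
worst = max(abs(speed_defect(s, t)) for s in grid for t in grid)
print(f"largest deviation from constant speed: {worst:.2e}")
```

In polar coordinates the pair $(\sqrt{a(s)},\rho(s))$ describes the straight segment from $(1,0)$ to $(0,1)$ in the plane, which is why the deviation is zero up to rounding.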

We have $\mathsf{HK}(\mu_0,\mu_1)^2=2$ and $\mu(1/2)=\tfrac12\delta_{\frac{\pi}{4}e_1}$, and all the quantities in the semiconcavity 
condition (5.10) can be evaluated explicitly. This yields a lower bound for $K$, namely 

$$K \ge \frac{\tfrac12\mathsf{HK}(\mu_0,\mu_*)^2+\tfrac12\mathsf{HK}(\mu_1,\mu_*)^2-\mathsf{HK}(\mu(\tfrac12),\mu_*)^2}{\tfrac14\mathsf{HK}(\mu_0,\mu_1)^2} = 1+\sqrt{b}\,\phi(y) \quad\text{with}\quad \phi(y)=\sqrt{8}\cos_{\pi/2}(y)-4\cos_{\pi/2}\big(\sqrt{y^2+\pi^2/16}\big),$$ 

where $\cos_{\pi/2}(a)=\cos(\min\{|a|,\pi/2\})$. Since $\phi(y)>0$ for $y\in\,]0,\sqrt{3}\pi/4[$ and since $b$ can 
be chosen arbitrarily large, we see that there cannot exist a finite $K$ such that $\mathsf{HK}$ is 
$K$-semiconcave. 
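
The lower bound can be reproduced numerically. The following Python sketch evaluates the quotient from the semiconcavity condition at $s=1/2$ for the three Dirac measures of the example and compares it with $1+\sqrt{b}\,\phi(y)$. It assumes the explicit Dirac formula $\mathsf{HK}(a_0\delta_{x_0},a_1\delta_{x_1})^2=a_0+a_1-2\sqrt{a_0a_1}\cos(\min\{|x_0-x_1|,\pi/2\})$; the function names are hypothetical.

```python
import math

def cos_trunc(a):
    """Truncated cosine: cos(min(|a|, pi/2))."""
    return math.cos(min(abs(a), math.pi / 2))

def hk_sq_dirac(a0, x0, a1, x1):
    """Assumed formula for the squared HK distance of a0*delta_x0 and a1*delta_x1."""
    return a0 + a1 - 2.0 * math.sqrt(a0 * a1) * cos_trunc(math.dist(x0, x1))

def phi(y):
    """Closed-form factor multiplying sqrt(b) in the lower bound for K."""
    return math.sqrt(8.0) * cos_trunc(y) - 4.0 * cos_trunc(math.sqrt(y * y + math.pi ** 2 / 16))

def k_lower_bound(b, y):
    """Quotient obtained from (5.10) at s = 1/2 for mu_0, mu_1 and mu_* = b*delta_z."""
    x0, x1 = (0.0, 0.0), (math.pi / 2, 0.0)
    mid, z = (math.pi / 4, 0.0), (math.pi / 4, y)
    numerator = (0.5 * hk_sq_dirac(1.0, x0, b, z)
                 + 0.5 * hk_sq_dirac(1.0, x1, b, z)
                 - hk_sq_dirac(0.5, mid, b, z))  # mu(1/2) has mass 1/2 at (pi/4) e_1
    denominator = 0.25 * hk_sq_dirac(1.0, x0, 1.0, x1)
    return numerator / denominator

y = 0.5
for b in (1.0, 1e2, 1e4, 1e6):
    print(b, k_lower_bound(b, y), 1.0 + math.sqrt(b) * phi(y))
```

For fixed $y$ in the admissible interval the two printed columns agree, and the bound grows like $\sqrt{b}$, in line with the conclusion that no finite $K$ can work.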